Abstract


The  major  objectives  of  the  report  were  to  identify  and  review  the  field  of  image  fusion  and 
contributing  technologies  and  to  recommend  systems,  algorithms  and  metrics  for  the  proposed 
SIHS TD Vision SST fusion test bed. A search of the relevant literature was conducted using
several databases and approximately 150 papers of primary utility were identified for review. The
report  provides  an  in-depth  introduction  to  fusion  hardware  and  software  technologies  and 
evaluation  metrics.  The  effort  focused  on  identifying  promising  sensing  fusion  technologies  that 
could  be  utilized  by  the  Soldier’s  Integrated  Helmet  System  Technology  Demonstrator  (SIHS  TD). 
The SIHS TD Vision Sub-System Team plans to develop a fusion test bed in the near term to
quantify dismounted soldier performance. The systems examined in this project were projected to
be mature and compatible with man-packed applications by the year 2007. The literature review
identified  considerable  technological  advancements  in  sensor  size  reduction,  power  demand 
reductions,  and  increases  in  resolution.  The  report  analysed  select  sensor  systems  for  their 
suitability  in  the  fusion  test  bed  based  on  sensor  form  factors,  detector  resolution,  and  real  time 
performance. Recommendations are made on which sensors to include in the fusion test bed.

The report provides an in-depth introduction to image fusion approaches. A list of potential
fusion algorithms was identified and reviewed. Recommendations on what fusion algorithms
should  be  examined  in  the  fusion  test  bed  are  provided.  A  number  of  subjective  and  objective 
fusion  evaluation  approaches  and  metrics  were  proposed  in  the  literature  to  quantify  and  qualify 
image  fusion  performance.  Recommendations  on  what  valid  fusion  metrics  should  be  utilized  in 
the  fusion  test  bed  are  provided.  Improvements  to  fusion  subjective  evaluation  approaches  are  also 
detailed.  Finally,  summary  suggestions  for  the  Vision  SST  fusion  test  bed  are  provided. 
Résumé


The main objectives of the report are to identify and review the field of image fusion and its
supporting technologies, and to recommend systems, algorithms and metrics for the fusion test bed
of the Vision Sub-System Team, as part of the Soldier’s Integrated Helmet System Technology
Demonstrator (SIHS TD). A search of the relevant literature in the appropriate databases
identified approximately 150 documents of primary utility for review. The report presents in
detail fusion hardware and software technologies and evaluation metrics. The work focuses on
identifying promising sensing and fusion technologies that could be used in the SIHS TD.


The SIHS TD Vision Sub-System Team plans to develop a fusion test bed in the near term to
quantify dismounted soldier performance. The systems examined in this project are expected to
be mature and compatible with man-portable applications by 2007. The literature review
identified considerable technological advances in sensor size reduction, power consumption
reduction and increased resolution. The report analyses selected sensor systems for their
suitability for the fusion test bed based on sensor form factors, detector resolution and real
time performance. Recommendations are included on which sensors to integrate into the fusion
test bed. The report presents image fusion approaches in detail. A list of potential fusion
algorithms is compiled and reviewed. Recommendations are made on which fusion algorithms should
be examined in the fusion test bed. A number of subjective and objective fusion evaluation
approaches and metrics are proposed in the literature to quantify and qualify image fusion
performance. Valid fusion metrics are recommended for the fusion test bed. Details are also
provided on improvements to subjective fusion evaluation approaches. Finally, summary
suggestions are presented for the Vision Sub-System Team’s fusion test bed.
Executive Summary


The  major  objectives  of  the  report  were  to  identify  and  review  the  field  of  image  fusion  and 
contributing  technologies  and  to  recommend  systems,  algorithms  and  metrics  for  the  proposed 
SIHS  TD  Vision  SST  fusion  test  bed. 

A search of the relevant literature was conducted using the following databases: PsycINFO, National
Technical Information Service (NTIS), SPIE, IEEE, Optical Engineering, GlobalSpec, Defence
Research Reports and the World Wide Web (www). Keywords included combinations of “Image
Fusion” and “Sensor”, “Hardware” and “Multi-sensor”. When a keyword yielded an
unmanageably large number of references, the researcher systematically added further
keywords to refine the search. In general, this process produced many irrelevant references.
“Snowball”  techniques  starting  with  known  authors  and  papers  and  following  up  their  references  to 
other  work  tended  to  produce  more  fruitful  results.  At  the  end  of  the  search,  approximately  150 
papers  of  primary  utility  were  identified  for  review. 

The  report  provides  an  in-depth  introduction  to  fusion  hardware  and  software  technologies  and 
evaluation  metrics.  Factors  affecting  performance  are  introduced.  The  review  also  identified 
development  trends  for  various  existing  and  emerging  sensor  technologies,  fusion  approaches  and 
evaluation  metrics.  The  effort  focused  on  identifying  promising  sensing  fusion  technologies  that 
could  be  utilized  by  the  Soldier’s  Integrated  Helmet  System  Technology  Demonstrator  (SIHS  TD). 
The  SIHS  TD  Vision  Sub-System  Team  plans  to  develop  a  fusion  test  bed  in  the  near  term  to 
quantify  dismounted  soldier  performance.  The  systems  examined  in  this  project  were  projected  to 
be mature and compatible with man-packed applications by the year 2007.

Over  200  potential  SIHS  TD  imaging  sensors  were  identified  in  this  review.  The  sensors  included 
the  following: 

•  Night  cameras 

o  LLLTV 
o  ICCD 
o  ICMOS 
o  EMCCD 
o  EBCMOS 
o  CCD/CMOS  Hybrid 
o  Colour  CMOS 

•  Thermal  Sensors 

o  Thermal  Light  Valve  (TLV)  CMOS  Camera 
o  SWIR 
o  MWIR/LWIR 
o  Fused  SWIR  &  LWIR 

The  literature  review  identified  considerable  technological  advancements  in  sensor  size  reduction, 
power  demand  reductions,  and  increases  in  resolution.  A  new  thermal  imaging  system  based  upon 
a passive optical filter called a thermal light valve may provide significant benefits to future soldier
modernization programs. Advances in the resolution of ICMOS and EBCMOS low light cameras
may eliminate the need to incorporate image intensified NVGs on future helmets. The report
analysed select sensor systems for their suitability in the fusion test bed based on sensor form
factors, detector resolution, and real time performance. Recommendations are made on which
sensors to include in the fusion test bed.

The  literature  review  also  identified  COTS  fusion  boards  that  could  accelerate  the  SIHS  TD  Vision 
Sub-System  Team’s  fusion  test  bed  development.  State  of  the  art  fusion  processing  system 
architectures  are  described.  The  report  analyses  selected  fusion  systems  based  on  their  ability  to 
handle up to four sensors, perform real time image fusion, offer an open architecture and fit a
relatively small form factor.

The report provides an in-depth introduction to image fusion approaches. A list of potential
fusion algorithms was identified based on the number of times cited, the availability of
information, and the applicability to a night vision image fusion test bed. Algorithms reviewed
include  the  following: 

•  Pixel  Level  Image  Fusion 

o  Simple  Averaging  Technique 
o  Principal  Components  Analysis  (PCA) 

•  Pyramid  Based  Fusion  Schemes 

o  Laplacian  Pyramid  Algorithm  (LAP) 
o  Morphological  Pyramid  Algorithm  (MORPH) 
o  Gradient  Pyramid  Algorithm  (GRAD) 
o  Ratio  of  Low-Pass  Pyramid  Algorithm  (RoLP) 

•  Wavelet  Transforms  (WT) 

o  Discrete  Wavelet  Transform  (DWT) 
o  Shift-Invariant  Discrete  Wavelet  Transform  (SiDWT) 

•  Feature  Level  Image  Fusion 

o  Edge  Detection  Method 

•  Decision  Level  Image  Fusion 

Recommendations  on  what  fusion  algorithms  should  be  examined  in  the  fusion  test  bed  are 
provided. 
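To make the two pixel-level techniques in the list above concrete, the following is a minimal sketch in plain Python, assuming two co-registered greyscale images of equal size stored as lists of rows. The function names are illustrative and do not come from the report. The PCA variant shown is one common formulation (weights taken from the principal eigenvector of the 2x2 covariance of the two images, with absolute values used as a simplification); the report may describe a different variant.

```python
import math

def weighted_fusion(a, b, wa, wb):
    """Pixel-wise weighted sum of two equally sized greyscale images."""
    return [[wa * pa + wb * pb for pa, pb in zip(ra, rb)]
            for ra, rb in zip(a, b)]

def average_fusion(a, b):
    """Simple averaging: both sources contribute equally."""
    return weighted_fusion(a, b, 0.5, 0.5)

def pca_weights(a, b):
    """Fusion weights from the principal eigenvector of the 2x2
    covariance matrix of the two images (assumed formulation)."""
    xs = [p for row in a for p in row]
    ys = [p for row in b for p in row]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs) / n
    syy = sum((y - my) ** 2 for y in ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    # Largest eigenvalue of [[sxx, sxy], [sxy, syy]].
    lam = 0.5 * (sxx + syy + math.sqrt((sxx - syy) ** 2 + 4.0 * sxy ** 2))
    if abs(sxy) > 1e-12:
        v1, v2 = sxy, lam - sxx          # matching eigenvector
    else:
        # Uncorrelated sources: favour the one with more variance.
        v1, v2 = (1.0, 0.0) if sxx >= syy else (0.0, 1.0)
    v1, v2 = abs(v1), abs(v2)            # simplification: ignore sign
    total = v1 + v2
    return v1 / total, v2 / total

if __name__ == "__main__":
    img_a = [[0, 10], [20, 30]]
    img_b = [[0, 1], [2, 3]]
    wa, wb = pca_weights(img_a, img_b)
    print(average_fusion(img_a, img_b))
    print(weighted_fusion(img_a, img_b, wa, wb))
```

In this sketch the source with the larger variance receives the larger weight, which is the usual motivation for PCA weighting over plain averaging. A practical test bed would more likely use an array library than nested lists.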

The  ultimate  aim  of  image  fusion  is  to  create  a  faithful  and  composite  image  that  retains  the 
important  information  from  the  source  images  while  minimizing  the  noise  caused  by  fusing  the 
images.  For  the  SIHS  application,  these  images  will  be  typically  viewed  and  interpreted 
(perceived)  by  an  operator.  A  number  of  subjective  and  objective  evaluation  approaches  and 
metrics  have  been  proposed  in  the  literature  to  quantify  and  qualify  image  fusion  performance. 
While subjective evaluation approaches generally follow a signal detection paradigm, objective
approaches differ considerably. Four general approaches to objective evaluation were identified:
methods based on statistical characteristics, methods based on definition, methods based on
information theory, and methods based on important features. COTS fusion evaluation modules
available for use by the Vision SST are also identified.
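As an illustration of what objective metrics of this kind compute, the sketch below implements root mean square error (RMSE), peak signal-to-noise ratio (PSNR) and mutual information (MI) in plain Python. It assumes greyscale images stored as lists of rows with integer pixel values, and, for RMSE and PSNR, that a reference image is available, which is an assumption of this example rather than something the report states. Treating RMSE and PSNR as statistical measures and MI as an information-theoretic measure is one plausible reading of the categories above.

```python
import math
from collections import Counter

def _flatten(img):
    return [p for row in img for p in row]

def rmse(ref, img):
    """Root mean square error between a reference and a fused image."""
    r, f = _flatten(ref), _flatten(img)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(r, f)) / len(r))

def psnr(ref, img, peak=255.0):
    """Peak signal-to-noise ratio in dB (infinite for identical images)."""
    e = rmse(ref, img)
    return math.inf if e == 0 else 20.0 * math.log10(peak / e)

def mutual_information(a, b):
    """Mutual information in bits, from the joint grey-level histogram."""
    pa, pb = _flatten(a), _flatten(b)
    n = len(pa)
    joint = Counter(zip(pa, pb))
    ca, cb = Counter(pa), Counter(pb)
    # p(x,y) * log2( p(x,y) / (p(x) p(y)) ), written with raw counts.
    return sum((c / n) * math.log2(c * n / (ca[x] * cb[y]))
               for (x, y), c in joint.items())

if __name__ == "__main__":
    source = [[0, 0], [255, 255]]
    fused = [[0, 10], [250, 255]]
    print(rmse(source, fused), psnr(source, fused))
    print(mutual_information(source, fused))
```

A fused image that shares more information with a source image yields a higher MI with that source; the MI between an image and a constant image is zero.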

Recommendations  on  what  valid  fusion  metrics  should  be  utilized  in  the  fusion  test  bed  are 
provided.  Improvements  to  fusion  subjective  evaluation  approaches  are  also  provided. 

Finally,  summary  suggestions  for  the  Vision  SST  fusion  test  bed  are  provided. 
The main objectives of the report are to identify and review the field of image fusion and its
supporting technologies, and to recommend systems, algorithms and metrics for the fusion test bed
of the Vision Sub-System Team, as part of the Soldier’s Integrated Helmet System Technology
Demonstrator (SIHS TD). A search of the relevant literature was conducted in the following
databases: PsycINFO, National Technical Information Service (NTIS), SPIE, IEEE, Optical
Engineering, GlobalSpec, Defence Research Reports and the World Wide Web (www). Combinations of
“Image Fusion” and “Sensor”, as well as “Hardware” and “Multi-sensor”, were used as keywords.
When a keyword returned an unmanageably large number of references, additional keywords were
systematically added to refine the search. In general, this approach produced many irrelevant
references. More fruitful results were obtained with “snowball” techniques, which start from
known authors and documents and then follow their references to other work. At the end of the
search, approximately 150 documents of primary utility had been identified for review.

The report presents in detail fusion hardware and software technologies and evaluation metrics.
Factors affecting performance are described. The review also highlights development trends for
various existing and emerging sensor technologies, fusion approaches and evaluation metrics. The
work focuses on identifying promising sensing and fusion technologies that could be used in the
SIHS TD.

The SIHS TD Vision Sub-System Team plans to develop a fusion test bed in the near term to
quantify dismounted soldier performance. The systems examined in this project are expected to
be mature and compatible with man-portable applications by 2007. More than 200 potential imaging
sensors for the SIHS TD are identified in the report. They include the following:


• Night cameras

o LLLTV
o ICCD
o ICMOS
o EMCCD
o EBCMOS
o CCD/CMOS Hybrid
o Colour CMOS

• Thermal Sensors

o Thermal Light Valve (TLV) CMOS Camera
o SWIR
o MWIR/LWIR
o Fused SWIR & LWIR


The literature review identified considerable technological advances in sensor size reduction,
power consumption reduction and increased resolution. A new thermal imaging system based on a
passive optical filter called a thermal light valve could provide significant benefits to future
soldier modernization programs. Advances in the resolution of ICMOS and EBCMOS low light cameras
may eliminate the need to incorporate image intensified NVGs on future helmets. The report
analyses selected sensor systems for their suitability for the fusion test bed based on sensor
form factors, detector resolution and real time performance. Recommendations are included on
which sensors to integrate into the fusion test bed.


The literature review also identified COTS fusion boards that could accelerate the development
of the SIHS TD Vision Sub-System Team’s fusion test bed. State of the art fusion processing
system architectures are described. The report analyses selected fusion systems based on their
ability to handle up to four sensors, real time image fusion, an open architecture and a
relatively small form factor. The report presents image fusion approaches in detail. A list of
potential fusion algorithms is compiled based on the number of citations, the availability of
information and the applicability to the night vision image fusion test bed. The following
algorithms are reviewed:


• Pixel Level Image Fusion

o Simple Averaging Technique
o Principal Components Analysis (PCA)

• Pyramid Based Fusion Schemes

o Laplacian Pyramid Algorithm (LAP)
o Morphological Pyramid Algorithm (MORPH)
o Gradient Pyramid Algorithm (GRAD)
o Ratio of Low-Pass Pyramid Algorithm (RoLP)

• Wavelet Transforms (WT)

o Discrete Wavelet Transform (DWT)
o Shift-Invariant Discrete Wavelet Transform (SiDWT)

• Feature Level Image Fusion

o Edge Detection Method

• Decision Level Image Fusion


Recommendations are made on which fusion algorithms should be examined in the fusion test bed.
The ultimate aim of image fusion is to create a faithful composite image that retains the
important information from the source images while reducing the noise caused by fusing the
images. For the SIHS application, these images will typically be viewed and interpreted
(perceived) by an operator. A number of subjective and objective evaluation approaches and
metrics are proposed in the literature to quantify and qualify image fusion performance. While
subjective evaluation approaches generally follow a signal detection paradigm, objective
approaches differ considerably. Four general approaches to objective evaluation are identified:
methods based on statistical characteristics, methods based on definition, methods based on
information theory and methods based on important features. COTS fusion evaluation modules are
made available to the Vision Sub-System Team. Valid fusion metrics are recommended for the
fusion test bed. Details are also provided on improvements to subjective fusion evaluation
approaches. Finally, summary suggestions are presented for the Vision Sub-System Team’s fusion
test bed.
1 Introduction


Effective system integration, especially with regard to head-borne systems, remains one of the
biggest  challenges  in  soldier  modernization  R&D.  While  several  allied  Soldier  Modernization 
Programs  (SMPs)  are  developing  prototype  future  headwear  systems  by  adding  sensing, 
information  display,  and  communications  technologies  to  existing  helmets,  little  or  no  progress  has 
been  made  in  integrating  enhanced  ballistic,  Chemical  Biological  (CB),  blast  or  thermal  protection 
into  the  system.  In  fact,  in  many  cases,  trade-offs  with  protection  have  been  made  in  order  to 
accommodate  the  specific  technologies.  Thus,  a  fully  integrated  head  system  design  that  properly 
addresses  future  operational  technology  requirements,  personnel  protection,  and  human  factors  and 
performance issues is not the focus of current SMPs. This work is critical to the success of the
Canadian  Land  Staff  (CLS)  Capital  Acquisition  Program  called  Integrated  Soldier  System  Platform 
(ISSP). 

The  Soldier’s  Integrated  Helmet  System  Technology  Demonstrator  (SIHS  TD)  project  will  develop 
and  demonstrate  three  unique  technology  concepts  that  represent  different  levels  of  integration.  The 
concepts  will  range  from  a  combined  add-on  system  where  components  are  added  piecemeal  to 
existing  headwear  systems,  through  a  bottom-up-designed  modular/compatible  approach  where 
subsystem  functionality  can  be  added  or  removed  as  and  when  needed,  to  a  fully  and  permanently 
encapsulated  design  where  weight,  space,  protection  and  functionality  are  optimized  maximally. 

The  SIHS  programme  will  empirically  determine  the  most  promising  headwear  integration  concept 
that  significantly  enhances  the  survivability  and  effectiveness  of  the  future  Canadian 
soldier/warfighter  by  developing,  evaluating,  and  demonstrating  novel  concepts  for  integrating 
enhanced  protection,  sensing,  information  display,  and  communications  technologies  into  a 
headwear  system  (Tack,  2007).  To  this  end  SIHS  has  developed  a  number  of  helmet  concepts  that 
include novel sensors (see Figure 1).


Figure 1: SIHS Concept 3 - C4I/Survivability
A variety of imaging sensors are available for inclusion in the SIHS TD, and each sensor has
particular strengths and weaknesses. One proposed approach is to utilize fused sensors (Angel,
Vilhena and Morton, 2006a). Multi-sensor image fusion has become a valuable reality in defence
applications. The benefits of image fusion have also been demonstrated in a large number of
studies (e.g., Toet and Ijspeert, 1997; Dixon, Canga, Noyes, Troscianko, Nikolov, Bull and
Canagarajah, 2006; Angel and Vilhena, 2005). These results suggest that the SIHS TD should
investigate the impact of fusion on dismounted soldier activities.

In  association  with  Defence  Research  and  Development  Canada  (DRDC)  Valcartier,  and  the 
Electro-Optic  Test  Facility  (EOTF)  of  the  United  States  Marine  Corps  (USMC),  the  SIHS  TD 
Vision  Sub-System  Team  (SST)  is  exploring  sensor  imagery  fusion  as  part  of  the  SIHS  TD. 
DRDC Valcartier has previously investigated fusion algorithms and man-portable fusion systems,
but this work is now almost five years out of date. The Vision SST and EOTF have
developed  an  initial  research  proposal  to  investigate  fusion  for  the  SIHS  TD.  An  outline  of  the 
proposed  work  is  as  follows: 

1. Conduct a literature review to identify current fusion capabilities, current hardware, current
software  algorithms,  and  any  promising  technologies  for  the  future. 

2.  Evaluate  algorithm  options  to  more  clearly  define  and  understand  the  various  effects  and 
transformations  the  potential  algorithms  generate  on  imagery. 

3.  Acquire  sensors  and  hardware.  In  collaboration  between  DRDC  Toronto,  DRDC 
Valcartier  and  EOTF,  sensing  devices  of  interest  will  be  acquired. 

4. Identify the required characteristics of raw imagery to be collected for fusion studies.
These characteristics should include season, numbers and types of targets (person/vehicle,
mobile/stationary or a mixture), and lighting.

5.  Collect  imagery.  EOTF,  in  collaboration  with  DRDC  Valcartier,  will  collect  the  imagery. 

6. Conduct psychophysical tests on fusion imagery to quantify operator performance. In
collaboration between DRDC Toronto, DRDC Valcartier and EOTF, subjective and
objective testing will be undertaken.

Given  the  potential  benefits  to  Canadian  and  USMC  SMPs,  support  was  given  to  the  Vision  SST  to 
conduct  the  state  of  the  art  literature  review.  This  report  will  outline  the  results  of  the  literature 
review.  The  review  investigated  the  latest  trends  in  imaging  sensors,  fusion  hardware,  software, 
and evaluation metrics. Based on the findings of this literature review, a way ahead for the Vision
SST  fusion  study  will  be  proposed. 

1.1  Electromagnetic  Spectrum 

A basic knowledge of the electromagnetic spectrum is helpful for understanding current and
emerging night vision and vision enhancement technologies. The electromagnetic spectrum is the
term used to describe the range of wavelengths of energy emitted by any object or living creature.
All objects emit infrared energy, and the amount emitted increases with the temperature of the
object: warmer objects emit more energy. Figure 2 shows the spectrum. Apart from the visible
band, no part of the spectrum can be seen by the human eye. To make the non-visible bands visible
to the human eye, technologies have been developed that convert or amplify their energy into the
visible spectrum.
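The dependence of emitted energy on temperature can be made concrete with two standard blackbody relations. These are textbook results rather than material drawn from the reviewed literature, and the 300 K object is an illustrative assumption.

```latex
% Stefan-Boltzmann law: total radiant exitance M of a surface with
% emissivity \varepsilon at absolute temperature T
M = \varepsilon \sigma T^{4},
\qquad \sigma \approx 5.67 \times 10^{-8}\ \mathrm{W\,m^{-2}\,K^{-4}}

% Wien's displacement law: wavelength of peak emission
\lambda_{\max} = \frac{b}{T},
\qquad b \approx 2898\ \mu\mathrm{m\,K}

% Worked example for an object near ambient temperature (assumed T = 300 K)
\lambda_{\max} \approx \frac{2898\ \mu\mathrm{m\,K}}{300\ \mathrm{K}} \approx 9.7\ \mu\mathrm{m}
```

A peak near 9.7 µm lies in the long wave infrared band as it is commonly defined (roughly 8 to 14 µm), which is why objects at everyday temperatures are imaged by thermal sensors rather than by visible-band cameras.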
Figure 2: (Top) Nomenclature for various parts of the electromagnetic spectrum.
(Bottom) A picture simultaneously imaged in various parts of the spectrum (Wolff,
Socolinsky and Eveland, 2006)

1.2  Applications  of  Image  Fusion 

Image fusion of multispectral images has been increasingly studied as a means of enhancing
performance in military applications. With the growing availability of Commercial-off-the-Shelf
(COTS) sensors and cameras that image in the VIS-NIR, SWIR, MWIR, and LWIR bands, there is a
corresponding increase in the practical exploitation of different fusion combinations of these
spectral bands (Wolff et al., 2006).

There are numerous applications of image fusion in the military domain, including automatic
target recognition (ATR), identification-friend-foe-neutral (IFFN), and battlefield surveillance
and situation assessment. In some applications the user may set the degree of fusion to select
between sensor fusion outputs. For example, the relative contribution of the infrared and thermal
imagery may be adjusted, which varies the hue of the fused image.
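A user-adjustable degree of fusion of this kind can be sketched as a false-colour mapping in which a single mix setting shifts the hue of the output. The channel assignment used here (thermal to red, intensified to green) and all function names are assumptions made for illustration; the report does not specify how any particular system maps its sensors to colour.

```python
import colorsys

def false_colour_pixel(thermal, intensified, mix):
    """Map one thermal and one intensified sample (each in 0..1) to RGB.

    mix = 0.0 shows only the intensified channel (green);
    mix = 1.0 shows only the thermal channel (red).
    The channel-to-colour assignment is an assumption of this sketch.
    """
    if not 0.0 <= mix <= 1.0:
        raise ValueError("mix must be in [0, 1]")
    red = mix * thermal
    green = (1.0 - mix) * intensified
    return (red, green, 0.0)

def hue_degrees(rgb):
    """Hue of an RGB triple in degrees, using the standard HSV model."""
    h, _s, _v = colorsys.rgb_to_hsv(*rgb)
    return h * 360.0

if __name__ == "__main__":
    # The same scene sample viewed at three user settings.
    for mix in (0.0, 0.5, 1.0):
        rgb = false_colour_pixel(1.0, 1.0, mix)
        print(mix, rgb, round(hue_degrees(rgb), 1))
```

Sweeping the mix setting from 0 to 1 moves a pixel that is bright in both sensors from green through yellow to red, which is one simple way a change in fusion ratio can appear to the operator as a change in hue.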

The  benefits  of  multi-sensor  image  fusion  include  (Angel,  Vilhena,  and  Morton,  2007): 

•  Extended  range  of  operation; 

•  Extended  spatial  and  temporal  coverage; 

•  Reduced  uncertainty; 

•  Increased  reliability; 

•  Robust  system  performance;  and 

•  Compact  representation  of  information. 

The  US  Defense  Advanced  Research  Projects  Agency  (DARPA)  is  currently  exploring  fusion  in  its 
Multispectral Adaptive Networked Tactical Imaging System (MANTIS). The goal of the MANTIS
program is to demonstrate a visualization system to regain the night-time advantage for the
individual soldier and provide unprecedented situational awareness. MANTIS consists of:

• A head-mounted, multispectral sensor suite (VIS/NIR/SWIR/LWIR), digital display and an
inertial  navigation  system;  and 

•  A  body-worn  processor  and  power  supply,  to  digitize,  process,  and  display  fused  imagery, 
augmented  reality  and  battlefield  information  in  real  time.  MANTIS  will  provide  small 
units  with  network-enabled,  collaborative  visualization  for  soldier-to-soldier  image 
sharing,  access  to  remote  sensors  and  targeting  handoff  to  off-board  weapons,  allowing  the 
soldier  to  point,  click  and  kill. 


2  Aim 


The  purpose  of  this  project  was  to  identify  and  review  the  field  of  image  fusion  and  contributing 
technologies  and  to  recommend  systems,  algorithms  and  metrics  for  the  proposed  SIHS  TD  Vision 
SST  fusion  test  bed. 


2.1  Abbreviations 


AGC 


Automatic  Gain  Control 
Automatic  Target  Recognition 
Airborne  Underwater  Geophysical 
Chemical  Biological 
Charge  Coupled  Device 

Canada  Institute  for  Scientific  and  Technical  Information 
Canadian  Land  Staff 

Complementary  Metal-Oxide-Semiconductor 

Commercial-Off-the-Shelf 

Defence  Advanced  Research  Projects  Agency 

Defence  Research  and  Development 

Double  Stimulus  Continuous  Quality  Evaluation 

Discrete  Wavelet  Transform 

Electron  Bombarded  Active  Pixel  Sensor 

Electron  Bombarded  CMOS 

Electron  Multiplying  CCD  Charge  Coupled  Device 

Enhanced  Night  Vision  Goggle 

Electro-Optic  Test  Facility 

Forward-Looking  Infra-red 

Field  Of  View 

Focal  Plane  Array 

Filter  Subtract  Decimate 


ATR 


AUG 


CB 

CCD 

CISTI 

CLS 

CMOS 

COTS 


DARPA 


DRDC 

DSCQE 

DWT 


EBAPS 

EBCMOS 


EMCCD 


ENVG 


EOTF 


FLIR 


FOV 


FPA 


FSD 
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GaAs 

Gallium  Arsenide 

GIFT 

Generalised  Image  Fusion  Toolkit 

GRAD 

Gradient  Pyramid  Algorithm 

HMD 

Helmet  Mounted  Display 

HSI 

Humansystems  Inc. 

I2  or  II 

Image  Intensified 

ICCD 

Intensified  CCD 

ICMOS 

Intensified  Complementary  Metal-Oxide-Semiconductor 

IEEE 

Institute  of  Electrical  and  Electronics  Engineers 

IFFN 

Identification-Friend-Foe-Neutral 

IFPM 

Image  Fusion  Performance  Measure 

In  GaAs 

Indium  Gallium  Arsenide 

IQI 

Image  Quality  Index 

IR 

Infrared 

ISSP 

Integrated  Soldier  System  Platform 

UK 

Insight  Toolkit 

LADAR 

Laser  Detection  And  Ranging 

LAP 

Laplacian  Pyramid  Algorithm 

LLL 

Low  Level  Light 

LLLTV 

Low  Level  Light  Television 

LWIR 

Long  Wave  Infrared 

MANTIS 

Multi-Spectral,  Adaptive,  Networked  Tactical  Imaging  System 

MBTI 

Myers-Briggs  Type  Indicator 

Ml 

Mutual  Information 

MORPH 

Morphological  Pyramid  Algorithm 

MOS 

Mean  Opinion  Score 

MR 

Multi-Resolution 

MSD 

Multiscale-Decomposition 

MWIR 

Medium  Wave  Infrared 

NATO 

North  Atlantic  Treaty  Organisation 

NGEOS 

Northrop  Grumman  Electro-Optical  Systems 

NIR 

Near  Infrared 

NMSD 

Non-Multiscale-Decomposition 

NTIS 

National  Technical  Information  Service 

NVD 

Night  Vision  Device 

NVESD 

Night  Vision  and  Electronic  Sensors  Director/Directorate  (US  Army) 

NVG 

Night  Vision  Goggle 

PCA 

Principal  Component  Analysis 

PSNR 

Peak  Signal  to  Noise  ratio 

Q 

Fusion  Quality  Measure/Index 
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QE 

Edge  Dependent  Fusion  Quality  Index 

QMF  DWT 

Quadrature  Mirror  Filter  Discrete  Wavelet  Transform 

QW 

Weighted  Fusion  Quality  Index 

QWIP 

Quantum  Well  Infrared  Photo  Detector 

R&D 

Research  and  Development 

RMSE 

Root  Mean  Square  Error 

ROC 

Receiver  Operating  Characteristic 

ROIC 

Read  Out  Integrated  Circuit 

RoLP 

Ratio  of  Low  Pass  Pyramid  Algorithm 

SA 

Situational  Awareness 

SF 

Spatial  Frequency 

SiDWT 

Shift-invariant  Discrete  Wavelet  Transform 

SIHSTDP 

The  Soldier  Integrated  Fleadwear  System  Technology  Demonstrator 

SIT 

Silicon  Intensified  Target 

SMaRTS 

Soldier  Mobility  and  Rifle  Targeting  System 

SMPs 

Soldier  Modernization  Programs 

SNR 

Signal  Noise  Ratio 

SPIE 

The  International  Society  for  Optical  Engineering 

SSCQE 

Single  Stimulus  Continuous  Quality  Evaluation 

SST 

Sub-System  Team 

STINET 

Scientific  and  Technical  Information  Network 

SWIR 

Short  Wave  Infrared 

TBIR 

Target-Background  Interference  Ratio 

Tl 

Thermal  Imaging 

HR 

Target  Interference  Ratio 

TNO 

Netherlands  Organisation  for  Applied  Scientific  Research 

UIQI 

Universal  Image  Quality  Index 

USMC 

United  States  Marine  Corps 

VDA 

Visual  Difference 

VIS 

Visible 

VOx 

Vanadium  Oxide 

WT 

Wavelet  Transforms 

WWW 

World Wide Web

This section outlines the methodology used in this scientific/academic search. Given the broad areas to investigate, a three-member team approach was used. Each member of the team was primarily responsible for one area of the research:

•  Hardware  -  sensors,  fusion  boards; 

•  Software  -  fusion  algorithms;  and 

•  Factors  -  evaluation  metrics. 

3.1  Keywords 

A set of keywords was developed by the project team for the literature search based on our experience with the pertinent technological, scientific, and military domains. These keywords were chosen because they focused the search on topics directly related to sensor fusion, sensor hardware, software, and evaluation metrics. The keywords in Table 1 were used in combination to search easily accessible databases: one word from the primary list was combined with one word from the secondary list and then one word from the tertiary list, until all combinations of primary, secondary, and tertiary words had been searched. If a search with three words returned an unmanageable number of hits, additional modifiers (from the keyword list) were used to focus the results.
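The combination procedure described above can be sketched in a few lines of Python. This is a minimal illustration only: the three short lists are hypothetical subsets of the keywords in Table 1, their assignment to primary, secondary and tertiary levels is an assumption, and the `AND` query syntax is a placeholder for whatever syntax a given database accepts.

```python
from itertools import product

# Hypothetical subsets of the Table 1 keywords (assumed grouping).
primary = ["image fusion", "sensor"]
secondary = ["LLLTV", "SWIR"]
tertiary = ["night vision goggles", "weapon sights"]


def build_queries(primary, secondary, tertiary):
    """Return one query string per primary/secondary/tertiary combination."""
    return [" AND ".join(terms) for terms in product(primary, secondary, tertiary)]


queries = build_queries(primary, secondary, tertiary)
for query in queries:
    print(query)
```

With two words at each level the sketch produces eight queries; the number of searches grows as the product of the list lengths, which is why additional modifiers were needed to keep the results manageable.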


Table 1: Primary, secondary and tertiary keywords for sensor fusion, hardware, software, and metrics


Core  Concept 

Primary  Keywords 

Related  Keywords 

Image  Fusion 

Systems 

Indirect  view,  direct  view,  emerging,  enhanced,  low  light, 
optical,  digital,  biologically-inspired,  range-gated 

Multi-sensor  fusion 

Application 

Area 

Primary  Keywords 

Related  Keywords 

Sensor 

LLLTV 

CCD 

I2 

Night  vision  goggles,  weapon  sights,  hand-held  systems,  tripod 
mounted  systems,  thermal  sights,  UAV 

NIR 

CMOS 

EBAPS 

TI

SWIR 

MWIR 

LWIR 

DAY 

Visible 

FLIR 

IR 

Vendors: DRS Technologies, Woodburn, Northrop Grumman, Sensors Unlimited, Nivisys, Insight Technology, Elcan, FLIR Systems, Stanford Photonics

Hardware 

Sensor  fusion  processors 

Video  processing  boards 

Image,  video  capture  cards 

Vendors: Octec, Equinox, Sarnoff, TNO, NVESD

Core Concept

Primary  Keywords 

Related  Keywords 

Software 

Algorithms,  image  fusion,  pixel  level,  feature 
level,  decision  level 

Techniques, analysis, methods, shift invariant discrete wavelet transform, Laplacian pyramid, principal component, filter-subtract-decimate (FSD), gradient, Gaussian pyramid, morphological, contrast pyramid, ratio of low pass pyramid, contrast

Metrics 

Evaluation,  analysis,  measure 

Performance 

Objective 

Subjective 

Quantitative 

Total  probability  density  function 

Comparative,  quantifying 

Image  quality  index 

Fusion  quality  index 

Quantitative  correlation  index 

Mutual  information 

Weighted  fusion  quality  index 

Edge  dependent  fusion  quality  index 

Spatial  detail 

Spectral  information 

Spatial  resolution 

Signal  to  noise 

Distortion 

Fisher  distance 

Fechner-Weber  contrast 

Target-background  interference  ratio  (TBIR) 

The core concept keywords were the most important words used in the search, as they represent the broad concepts to be investigated. The primary keywords were used as necessary to ensure that literature was sampled from several different areas within each core concept. For example, a search on the “sensor” core concept alone may or may not return literature on “NIR” or “LLLTV”. The purpose of the primary keywords was to ensure that research related to several different aspects of sensor fusion was explored.

3.2  Databases 

The following primary databases were the most relevant for searching the scientific/academic literature:


Table  2:  Primary  Databases  for  Scientific/Academic  Search 


Database 

Description 

SPIE  -  The  International  Society 
for  Optical  Engineering 

The SPIE Digital Library is a resource for optics and photonics information. It contains more than 70,000 full-text papers from SPIE Journals and Proceedings published since 1998. It also includes citations and abstracts for most SPIE papers published since 1993. Approximately 15,000 new papers will be added each year. (SPIE, 2007)

IEEE  -  Institute  of  Electrical  and 
Electronics  Engineers,  Inc 

The  IEEE,  a  non-profit  organization,  is  the  world's  leading  professional  association  for  the 
advancement  of  technology.  The  IEEE  publishes  nearly  a  third  of  the  world’s  technical  literature 
in  electrical  engineering,  computer  science  and  electronics.  This  includes  about  130  journals, 
transactions  and  magazines  and  over  400  conference  proceedings  published  annually.  IEEE 
journals  are  consistently  among  the  most  highly  cited  in  electrical  and  electronics  engineering, 
telecommunications  and  other  technical  fields.  (IEEE,  2007) 

NTIS  -  National  Technical 
Information  Service 

NTIS  is  an  agency  of  the  U.S.  Department  of  Commerce's  Technology  Administration.  It  is  the 
official  source  for  government  sponsored  U.S.  and  worldwide  scientific,  technical,  engineering, 
and  business  related  information.  The  database  contains  almost  three  million  titles,  including 
370,000  technical  reports  from  U.S.  government  research.  The  information  in  the  database  is 
gathered  from  U.S.  government  agencies  and  government  agencies  of  countries  around  the 
world.  (NTIS,  2007) 

CISTI - Canada Institute for Scientific and Technical Information

CISTI houses a comprehensive collection of publications in science, technology, and medicine. It contains over 50,000 serial titles and 600,000 books, reports, and conference proceedings from around the world. (CISTI, 2007)

The following were secondary databases for searching the scientific/academic literature:


Table 3: Secondary Databases for Scientific/Academic Search


Database 

Description 

STINET - Scientific and Technical Information Network

STINET  provides  access  to  citations  of  unclassified  unlimited  documents  that  have  been  entered 
into  DTIC's  Technical  Reports  Collection,  as  well  as  the  electronic  full-text  of  many  of  these 
documents.  Public  STINET  also  provides  access  to  the  Air  University  Library  Index  to  Military 
Periodicals,  Staff  College  Automated  Military  Periodical  Index,  DoD  Index  to  Specifications  and 
Standards,  and  Research  and  Development  Descriptive  Summaries.  (STINET,  2007) 

GlobalSpec 

GlobalSpec  is  the  leading  specialized  vertical  search,  information  services  and  e-publishing 
company  serving  the  engineering,  manufacturing  and  related  scientific  and  technical  market 
segments. GlobalSpec has I/PRO audited Web site traffic, and a global user base of more than
3,400,000  registered  users;  a  user  community  that  continues  to  grow  by  more  than  80,000  new 
registrants  each  month.  In  addition,  the  company  has  acquired  3,500,000  opt-in,  online  readers 
of  its  suite  of  product-specific  e-newsletters  that  cover  the  electrical  and  mechanical  engineering 
products  markets,  as  well  as  other  segments  of  the  electronics,  scientific  and  manufacturing 
industries.  GlobalSpec  is  increasingly  becoming  "the  place"  where  the  engineering  community 
gathers  and  conducts  business.  (GlobalSpec,  2007) 

In addition, the World Wide Web (WWW) was searched with all the keywords.


3.3  Search  Strategy 

The project team systematically searched the databases using the keywords specified. For example, the first keyword search series consisted of the core concepts listed in Table 1: “Image Fusion” and “Sensor”, “Hardware” and “Multi-sensor”. Other searches at this level used primary keyword variations, for example, “Indirect view” and “multi-sensor”. When a keyword yielded an unmanageably large number of references, the researcher systematically added primary keywords to refine the search. When a keyword yielded too few references, broader concepts were used until an appropriate level of analysis was reached.
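The narrowing half of this strategy can be illustrated with a short sketch. Everything in it is hypothetical: the threshold of 300 hits, the order of the added keywords, and the `fake_count_hits` function, which merely stands in for a real database query and halves an invented starting count for each added keyword.

```python
def refine_search(keywords, extra_keywords, count_hits, max_hits=300):
    """Add one keyword at a time until the hit count is manageable.

    count_hits is any callable that maps a list of keywords to a number
    of hits; max_hits is an assumed definition of "manageable".
    """
    query = list(keywords)
    remaining = list(extra_keywords)
    while count_hits(query) > max_hits and remaining:
        query.append(remaining.pop(0))
    return query


def fake_count_hits(query):
    """Stand-in for a database search: each extra keyword halves the hits."""
    return 4000 // (2 ** (len(query) - 1))


refined = refine_search(
    ["image fusion"],
    ["sensor", "LLLTV", "SWIR", "helmet", "metrics"],
    fake_count_hits,
)
print(refined, fake_count_hits(refined))
```

The broadening case (too few references) would follow the same pattern in reverse, removing the most specific keyword until enough references are returned.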

Once  core  concept  and  primary  keyword  searches  were  conducted  within  the  primary  databases,  all 
abstracts  were  reviewed.  In  the  case  of  the  GlobalSpec  database,  all  product  information  sheets 
were  reviewed. 

Secondary databases were explored in order to ensure that sensor fusion products (hardware and software) were accessed. The research team reviewed abstracts or technical data sheets for adequacy of relevance, quantity, and quality. If necessary, searches were refined and/or revised and continued using secondary level keywords. The project manager benchmarked the “hits” found during the search, and they are reported in the Results section.


3.4  Analysis  of  Literature 

Given the research area, the review of articles had multiple foci: first, to identify the specific sensor and hardware technologies available; second, to identify the most promising fusion approaches available; and third, to identify robust metrics to evaluate fusion performance. Once identified, the critical characteristics of each focus area were compiled; for sensors, for example, the critical characteristics included size, resolution, and frame rate. The articles/approaches reviewed in the literature search were then assessed using the relevant criteria.

4 Results


The  results  from  the  literature  search  are  organized  as  follows: 

•  Hardware  -  sensors,  fusion  boards; 

•  Software  -  fusion  algorithms;  and 

•  Factors  -  evaluation  metrics. 

4.1  Hardware 

Both  sensors  and  fusion  boards  were  reviewed.  Although  over  200  sensors  were  identified  in  this 
review,  only  those  that  were  judged  suitable  for  the  SIHS  TD  application  are  presented.  A 
summary  of  the  sensor  specifications,  organized  by  type,  is  provided  in  Annex  A. 

Seven  fusion  boards  were  also  identified  during  this  review.  A  summary  of  the  board 
specifications  is  provided  in  Annex  B. 

4.1.1  Sensors 

The  results  below  present  information  on  several  different  sensor  types:  Day-night  cameras, 
LLLTV,  ICMOS,  EMCCD,  EBCMOS,  Colour  CMOS,  SWIR,  MWIR  and  LWIR.  A  comparative 
analysis  of  potential  sensors  is  organized  by  type  in  the  Discussion  Section  (Section  5). 

4.1.1.1 Day-Night Cameras

Unlike many security cameras, which require high intensity Light Emitting Diodes (LEDs) to illuminate their targets, a number of high performance cameras are available for use in low light and full sun conditions. Typically these full range cameras include signal enhancements in low light. The DVS24-1000 camera from Defence Vision Systems (see Figure 3) provides a high resolution image across a wide dynamic range.


Figure 3: Defence Vision Systems day-night camera DVS24-1000 (from http://213.210.6.54/dvsmil)

Day-night cameras typically monitor scene illumination (auto gain) and can control an auto iris for use with custom lenses. The performance of day-night cameras at night does compare to the performance of dedicated night cameras.


4.1.1.2 Low Light Level Television (LLLTV)

LLLTV cameras are used in low light level conditions. There are a few distinct groups of LLLTV cameras: Silicon Intensified Target (SIT) tube cameras, Intensified Silicon Intensified Target (ISIT) tube cameras, Intensified Charge Coupled Device (ICCD) cameras, and cooled CCD cameras. The LLLTV sensor typically couples an Image Intensifier (I2) tube with a Charge Coupled Device (CCD). Images produced by the intensifier tube are displayed on the intensifier's phosphor screen. These images are relayed to a CCD camera by a fibre optic coupler or a simple relay optic, with a detection range extending above the normal visible (0.4 to 0.7 μm) wavelengths and into the short-wave infrared, usually to about 1.0 to 1.1 μm. Coupling an image intensifier tube to a CCD allows the human eye to see objects in extremely low light levels. The LLLTV sensor technology reduces the images into a series of lines.


It is possible to improve the performance of a non-intensified CCD detector by cooling the detector and using long integration times to reduce noise. While cooled CCD cameras can reach the performance of ICCD cameras, they require long integration times for detection and are therefore not suitable for real time applications.
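The trade-off between integration time and real time operation can be illustrated with a simplified signal-to-noise model. The sketch below is not taken from the reviewed literature: it assumes a commonly used textbook approximation in which photon shot noise, dark current noise and read noise add in quadrature, and every numerical value (signal rate, dark current, read noise) is a hypothetical placeholder chosen only to show the trend.

```python
import math


def snr(signal_rate, dark_rate, read_noise, t):
    """Simplified per-pixel SNR for an exposure of t seconds.

    signal_rate and dark_rate are in electrons per second and read_noise
    is in electrons RMS; all three are illustrative inputs.
    """
    signal = signal_rate * t
    noise = math.sqrt(signal + dark_rate * t + read_noise ** 2)
    return signal / noise


# Hypothetical low light scene: 100 signal electrons per second per pixel.
video_frame = snr(signal_rate=100, dark_rate=50, read_noise=10, t=0.033)
long_exposure = snr(signal_rate=100, dark_rate=50, read_noise=10, t=10.0)
# Cooling is modelled here only as a reduction in dark current.
cooled_long = snr(signal_rate=100, dark_rate=1, read_noise=10, t=10.0)

print(video_frame, long_exposure, cooled_long)
```

Under these assumed numbers a single 33 msec video frame stays below an SNR of 1, while a 10 second exposure is usable and improves further with cooling, which is consistent with the observation that cooled CCDs need integration times incompatible with real time imaging.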


Figure 4: Micro ICCD camera system from Defence Vision Systems (from http://213.210.6.54/dvsmil/PDF)

CCD-based sensors use a special manufacturing process that creates the ability to transport charge across the chip without distortion, whereas the Complementary Metal Oxide Semiconductor (CMOS) sensor uses the same traditional manufacturing process as most microprocessors. CCD-based sensors create high-quality, low-noise images, whereas CMOS sensors are more susceptible to noise. Furthermore, CCDs have been in mass production for a long period of time; they are therefore more mature and tend to produce higher quality images than CMOS sensors.

LLLTV cameras can be used in many applications. Our literature review showed that most of these applications are scientific and industrial. For example, LLLTV sensors are used in near-IR cellular and dermal applications, machine vision, high-content screening, and manufacturing inspection. In terms of military applications, they are used in surveillance imaging.

Annex A contains a list of fifteen LLLTV cameras/sensors that would be suitable for the SIHS application. These products come from four different manufacturers: DVC, Intevac, NAC Image Technology, and PCO Imaging.

4.1.1.3 Intensified Complementary Metal Oxide Semiconductor (ICMOS) Sensor

CCD cameras have been replaced in many commercial applications by Complementary Metal-Oxide-Semiconductor (CMOS), or camera-on-a-chip, systems. CMOS image sensors operate at lower voltages than CCDs, resulting in lower power consumption for dynamic applications such as a helmet mounted system. CMOS cameras also have a simpler design and may be integrated more easily than a CCD. As with CCD cameras, CMOS cameras can be coupled with intensifier tubes, creating ICMOS sensors (see Figure 5). There are two categories of ICMOS image sensors: analog and digital. Analog and digital processing functions can be integrated readily onto a CMOS chip. This reduces system package size and overall costs.


Figure  5:  I2  bonded  CMOS  image  sensor 

CMOS  chips  can  be  manufactured  on  any  standard  silicon  production  line,  thereby  making  them 
less  expensive  than  a  CCD  sensor.  Other  advantages  of  CMOS  sensors  include  (Beyondlogic, 
2005): 

•  No  blooming; 

•  Low  power  consumption.  Ideal  for  battery  operated  devices; 

•  Direct  digital  output; 

•  Small size and little support circuitry, often just a crystal and some decoupling; and

•  Simple  to  design  with. 

Annex A contains a list of seven ICMOS cameras/sensors that would be suitable for the SIHS application. These products come from five different manufacturers: Intevac, Irvine Sensors Corp., PCO Imaging, Prosilica Inc, and Vision Research Inc.

4.1.1.4 Electron Multiplying Charge Coupled Devices (EMCCD)

While an ICCD camera places an image intensifier in front of the CCD chip to enhance its light detection, an Electron Multiplying CCD (EMCCD) camera uses an alternative approach to a standard image intensifier. EMCCD cameras are currently being developed for special scientific applications (microscopy, spectroscopy, etc.); see Figure 6.

Figure 6: CoolView EM/1000 EMCCD camera (from http://www.photonic-science.co.uk/zz_CoolView_EM.html)

EMCCD cameras utilize a “gain register” electron multiplying structure. The gain register performs the same function as the intensifier microchannel plate but creates new electrons.

EMCCD cameras need to be cooled to reduce readout noise (for the CoolView EM/1000 this is in the order of -50°C).
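As a rough illustration of how a gain register multiplies charge, the sketch below uses a standard simplified model in which each register stage has a small probability of generating one extra electron, so that the mean gain over N stages is (1 + p)^N. The model, the stage count of 500 and the per-stage probability of 0.01 are assumptions made for illustration; they are not specifications of the CoolView camera or of any device reviewed here.

```python
def em_gain(stages, p_multiply):
    """Mean gain of a multiplication register under the (1 + p)^N model.

    stages is the number of register stages and p_multiply is the assumed
    probability that a stage adds one extra electron per incoming electron.
    """
    return (1.0 + p_multiply) ** stages


# Hypothetical register: 500 stages with a 1% multiplication probability.
gain = em_gain(stages=500, p_multiply=0.01)
print(f"mean gain: {gain:.0f}x")
```

Under this model a small per-stage probability still compounds into a gain of more than one hundred over a few hundred stages, which is how the signal can be raised above the readout noise without an intensifier tube.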

Evaluations  of  EMCCD  by  Dussault  and  Hoess  (2004)  did  not  demonstrate  advantages  of  using 
uncooled  EMCCD  cameras  over  ICCD  systems  -  see  Figure  7.  While  EMCCD  cameras  may  be  a 
credible  alternative  to  ICCDs  for  some  applications,  they  are  not  believed  to  be  adequate  for  SIHS 
TD  applications. 


Figure  7:  EMCCD  and  ICCD  Camera  comparison  in  low  ambient  light  conditions 

Top  row:  Stanford  Photonics  XR-Mega-10  Extreme  1400  x  1024  pixels  ICCD  detector,  33  msec 
exposure,  no  binning.  Middle  row:  Andor  EEV  iXon  EMCCD  camera  (512x512  pixels),  33  msec 
exposure,  no  binning.  Bottom  row:  Roper  Cool  Snap  1400  x  1024  CCD,  33  msec  exposure,  binned 
2x2. (From Dussault and Hoess, 2004)


4.1.1.5 Electron Bombarded CMOS (EBCMOS) Sensor


The Electron Bombarded CMOS (EBCMOS) sensor is a relatively new type of sensor. Intevac has patented the first EBCMOS technology, called the Electron Bombarded Active Pixel Sensor (EBAPS). EBAPS is based on the use of a GaAs (Gallium Arsenide) photocathode with a high resolution, backside thinned, CMOS Active Pixel Sensor (APS) imager anode. The photocathode emits electrons directly to the CMOS APS anode in an electron bombarded mode. A low noise gain is achieved in the CMOS anode due to the electron bombarded semiconductor gain process. The noise generated in the EBAPS is significantly lower than the noise output of the Generation-III I2 module. This low noise gain advantage is combined with modern semiconductor packaging and manufacturing approaches to enable a small EBAPS module that can be mass produced at a low cost (from http://www.intevac.com/imaging/technology).

The use of CMOS imagers enables the EBAPS sensor to address some of the key deficiencies found in previous Low Light Level cameras, such as size and high power consumption. The ultimate performance of the EBAPS depends on the architecture and design of the CMOS imager and the ability to produce an area with a 100% fill factor (no dead area). The EBAPS also achieves high performance through the use of the high efficiency GaAs photocathode, which is sensitive in the Near-IR region of the electromagnetic spectrum.

The EBAPS based camera has significant performance differences relative to a standard I2 camera. Since the EBAPS does not utilize a microchannel plate, it can be operated in a day only mode with no high voltage applied to the sensor. This mode of operation enables high performance near-IR imagery to be obtained in the day without any impact on the sensor's operational life.

The new EBAPS ISIE10 camera developed by Intevac surpasses all previous EBAPS models due to reduced noise from the CMOS imager and an increase in sensor size achieved by enlarging the pixel size to 10.9 μm. The development of CMOS imagers directly affects the performance of EBAPS sensors. With new generation CMOS imagers, the resolution of the EBAPS increases substantially, as does performance on target recognition measures.

The EBAPS camera offers substantially smaller size and weight than present Low Light Level cameras. The EBAPS also has a low sensor profile of approximately 3 cm compared to standard I2 goggles. This reduces the likelihood of entanglement in an operational environment and places the centre of gravity in a more favourable position with respect to the neck. The performance of the EBAPS ISIE10 is thought to rival that of Gen-III NVGs but in a much more favourable package.


Figure 8: EBAPS design (Aebi et al., 2005)

Annex A contains a list of three EBAPS cameras/sensors that would be suitable for the SIHS application. All of these products come from one manufacturer, Intevac. A report presented by Aebi et al. (2005) to the OPTRO 2005 International Symposium describes the advantages of the EBAPS system over current systems. That information provides the basis for the summary presented here.

4.1.1.6 CCD/CMOS Hybrid

Fairchild  Imaging  has  created  a  CCD/CMOS  hybrid  Focal  Plane  Array  (FPA)  for  low  light  level 
imaging  applications.  This  approach  combines  the  best  of  CCD  imaging  characteristics:  high 
quantum  efficiency,  low  dark  current,  excellent  uniformity,  and  low  pixel  cross  talk,  with  the  high 
speed,  low  power  and  ultra-low  read  noise  of  CMOS  readout  technology  (Liu,  Fowler,  Onishi,  Vu, 
Wen,  Do,  and  Horn,  2005).  The  FPA  has  two  components:  two  CMOS  readout  integrated  circuits 
(ROIC)  and  a  CCD  imaging  substrate  (see  Figure  9).  This  has  been  used  in  a  LLL  camera. 


Figure 9: Prototype CCD/CMOS hybrid FPA and low level light camera (Liu et al., 2005). Labelled components: CMOS ROIC, CCD imaging section, CCD storage section.

The above architecture eliminates the slow speed, high noise, and high power limitations of a conventional CCD, resulting in a compact, low power, ultra-sensitive solid-state FPA that can be used in low light level applications. Applications identified by Fairchild Imaging include live-cell microscopy and security cameras operating at room temperature. The prototype FPA has a 1280 x 1024 format with 12-μm square pixels.

4.1.1.7 Colour CMOS Cameras

The loss of situational awareness with monochrome night vision cameras and sensors has led to the development of colour night vision systems. These systems are sensitive to the visible to near-infrared (VNIR) portion of the spectrum. The systems display a rendition of the “colours” that would be seen by the observer in daylight conditions (see Figure 10). The literature review identified a number of colour night vision cameras and goggles.

Three different “True-color” approaches to colour night vision are available from Opto-Knowledge Systems (OKSI). One approach utilizes a fast-switching liquid crystal filter in front of a custom Gen-III image intensified CMOS camera, while the second is based around an EMCCD sensor with a mosaic filter applied directly to the detector. The third approach utilizes an ICMOS camera with Liquid Crystal Filter (see Figure 11).

Figure 10: Monochrome and color low-light-level imagery (from http://www.techexpo.com/WWW/opto-knowledge)


Figure 11: True-color night vision camera (ICMOS camera with Liquid Crystal Filter) (from http://www.techexpo.com/WWW/opto-knowledge)

The Tenebraex Corporation has another approach to colour night vision. They have two helmet mounted models at the final preproduction stage. The colour products are called the ColorPath™ CCNVD (Color Capable Night Vision Device); see Figure 12. The device uses a standard, green image intensifier tube and a mechanical filter. Tenebraex reports that “the CCNVD can generate a color image down to quarter-moon light levels. At lower light levels, with the Model OP, a simple twist of a knob moves the ColorPath technology from the optical path, leaving the user with a standard, monochromatic green night vision device with all the overcast moonless night performance that he had before.”

Figure 12: ColorPath CCNVD, Model MC (from http://camouflage.com/colornightvision.php)

The resolving performance of colour night vision goggles and cameras is currently not as good as that of dedicated monochrome systems. While manufacturers are currently improving their systems, they are not yet mature or capable enough to be considered for SIHS. By the time of ISSP Build #2, the systems may be potential candidates.

4.1.1.8 Thermal Light Valve (TLV) CMOS Cameras

A new development in thermal imaging is the use of a passive optic filter which translates thermal radiation into light that is imaged by a standard CMOS camera. Unlike other thermal technologies, which use microbolometers, QWIPs, etc., this system uses relatively simple technologies. The technology was first demonstrated in the laboratory by Aegis Semiconductor in 2004; its spin-off company RedShift Systems is now beginning to market the technology.

According to RedShift Systems’ website (from http://www.redshiftsystems.com/site/ImagingTechnology/ThermalLightValve), the core of their technology is the Thermal Light Valve (TLV); see Figure 13. The “TLV is a tunable filter composed of pixels standing on thermally isolating posts on an optically reflective and thermally conductive substrate. Each pixel acts as a passive wavelength converter. Using standard thermal optics, long-wavelength infrared (LWIR) radiation from the scene is imaged onto and absorbed by the TLV. This heats up select thermal pixels on the array in direct relation to the thermal signature of the scene. The minimum reflective wavelengths of the pixels shift based upon the thermal energy incident on each. A narrow-band near-infrared (NIR) light source is used to “probe” the temperature of the pixels across the TLV. This NIR probe signal is reflected off the TLV in varying amounts, depending on the pixel temperature, onto the CMOS imager. The intensity of the light received by the CMOS imager is therefore “modulated” by the heat signature of the scene. A thermal image is obtained by measuring the pixel-to-pixel variation in transmission of the NIR probe signal using CMOS imagers.”

Figure 13: Depiction of the Thermal Light Valve (from http://www.redshiftsystems.com)


While the core of RedShift’s technology is the TLV, the system also requires a CMOS sensor, a laser diode, lenses and a video processing board. OpTIC is RedShift’s brand name for its Optical Thermal Imaging Camera engines (see Figure 14).


Figure 14: OpTIC camera (from http://www.redshiftsystems.com/site/ImagingTechnology/CameraEngines). Labelled component: Video Processing Board based on Consumer DSP.

Currently, OpTIC engines are limited to 160x120 resolution. While the performance of thermal sensors based upon OpTIC has not been identified in the open literature, the scalability, low cost, low power and potential performance may make this technology suitable for ISSP.

4.1.1.9 Short Wavelength Infrared (SWIR) Sensors

The Short Wave Infrared (SWIR) spectrum covers the range from 1.1 to 2.5 μm. Typical applications include pharmaceutical, medical diagnostics, food and quality control.

A number of light weight SWIR sensors are currently available as COTS items. Sensors Unlimited has produced several light weight SWIR systems. An example is their SU320KTX (see Figure 15), which uses indium gallium arsenide (InGaAs) technology and is being used in the US SMaRTS (Soldier Mobility and Rifle Targeting System) and in the DARPA MANTIS (Multi-Spectral, Adaptive, Networked Tactical Imaging System) project.

Figure 15: SU320KTX SWIR

SWIR  detectors  are  being  developed  with  upper  limits  extending  to  2200  nanometers.  These 
systems  are  being  developed  for  extended  wavelength  range  hyperspectral  imaging  for  more 
effective  camouflage  detection  and  identification.  These  2200  nanometer  wavelength  devices  will 
also  go  into  long-wave  LADAR  systems  using  1.95  micrometer  wavelength  lasers  (Angel  et  al, 
2007). 

Specially processed InGaAs SWIR detectors are being developed to cover the range from 400 to 1700 nanometers (note that the visible spectrum ranges in wavelength from 0.4 to 0.7 μm and the NIR section typically spans 0.7 to 1.5 μm). These shorter-wavelength devices enable the military to see 850 nanometer lasers (AN/PAQ-4C, etc.) as well as the developmental 1.06 and 1.55 μm lasers, along with the visible image (day) of the target being illuminated.

SWIR  imagers  are  being  investigated  as  possible  replacements  for  LLLTV,  ICMOS  or  NVG 
systems.  The  image  appears  as  a  gray  scale  picture  -  see  Figure  16. 


Figure  16:  SWIR  camera  image 

Annex A contains a list of ten SWIR cameras/sensors that would be suitable for the SIHS application. These products come from four different manufacturers: FLIR, Intevac, Sensors Unlimited Inc., and Lumitron.

4.1.1.10  Mid  Wavelength  Infrared  (MWIR)  and  Long  Wavelength  Infrared  (LWIR) 
Sensors 

These “thermal” cameras typically cover ranges in the electromagnetic spectrum from 3 to 5 µm
(MWIR) and 8 to 12 µm (LWIR). MWIR and LWIR sensors have many industrial as well as
military applications. Industrial applications include wireless communications, spectroscopy,
weather forecasting, and astronomy; military applications include target acquisition, tracking,
and surveillance.

Northrop  Grumman  Electro-Optical  Systems  (NGEOS)  has  focused  in  recent  years  on  the 
development  of  enhanced  night  vision  goggles  (ENVG)  systems.  In  2003,  they  developed  an  NVG 
with  the  capability  of  producing  real-time  image  fusion  from  an  I2  sensor  and  an  uncooled  LWIR 
sensor, concentrating on both optical overlay and digital image fusion. This technology allows for
optimum imaging in battlefield-obscured and laser-polluted environments (Estrera, Ostromek, Isbell,
and Bacarella, 2003).

In  general,  MWIR  and  LWIR  cameras  are  much  more  expensive  than  LLLTV  cameras.  However, 
MWIR  and  LWIR  cameras  generally  have  better  performance/detection. 

Annex A contains a list of 25 MWIR and LWIR cameras/sensors that would be suitable for the
SIHS application. The products identified come from six different manufacturers: DRS
NVEC, ELCAN (Raytheon), FLIR, Irvine Sensors, L3 Communications Thermal Eye, and
Lumitron.

4.1.2  Commercial  Fusion  Development  Efforts 

Along  with  the  continuous  development  of  their  own  sensors,  a  number  of  companies  are  currently 
developing  fusion  systems  for  commercial  and  military  applications.  One  of  our  research  team 
members  had  the  opportunity  to  attend  the  USMC  Systems  Command  Infantry  Weapons  Systems 
Product  Group  13  Optics  and  Non-Lethal  Systems  briefing  to  industry  on  24  October  2006. 
Numerous  key  industry  players  were  in  attendance  to  present  their  products  and  future  plans  for 
sensor  fusion.  Table  4  highlights  companies  and  the  sensors  they  plan  to  fuse.  For  example, 
Northrop  Grumman  plans  to  fuse  LWIR  and  I2CMOS. 


Table 4: Future sensor fusion types

The majority of the current research is on the fusion of just two sensors. There were also
presentations regarding fusion of three or more sensors. In particular, Optics 1 is planning to fuse
thermal with NIR and SWIR. Northrop Grumman plans to eventually fuse sensors using 3-channel
fusion for visible/NIR, SWIR, and LWIR.

4.1.3 Fusion Processing System

In addition to sensor cameras, a fusion system requires a number of hardware components. These
include frame grabbers or digital input cards; raw image processing cards (warping for registration,
noise cleaning, contrast enhancement, adaptive dynamic range compression, etc.); a fusion
processing card; a data input card; a display card; and a host card. The system developed by Fay et al.
(2000) for their colour fusion study utilized a number of electronic cards and boards (see
Figure 17).


Figure  17:  Fusion  processing  system  utilized  by  Fay  et  al.  (2000) 

Fusion processing systems can be developed using readily available PCI-based video processing
boards, frame grabbers, backplane motherboards, Video Graphics Array (VGA) adapter boards,
etc. Another approach is to develop stand-alone Digital Signal Processor (DSP) systems. DSP
systems require drivers to be developed for the frame grabber, display, etc. Hines, Rahman,
Jobson, and Woodell (2006, June) utilized a single TI DM642 digital signal processor for the
fusion system developed for their Enhanced Vision System (EVS).


4.1.3.1 Dedicated Fusion Board

Another  approach  for  the  SIHS  Vision  SST  in  developing  a  fusion  processing  system  is  to  utilize 
dedicated  COTS  fusion  boards.  The  following  criteria  were  developed  for  selecting  a  stand  alone 
fusion  board: 

•  Able  to  handle  up  to  four  sensors  (digital  sources  TBC); 

•  Real  time  fusion; 

•  Open  architecture  to  implement  algorithms  of  choice;  and 

•  Small  form  factor. 

Many  of  the  boards  that  were  identified  in  the  preliminary  search  were  too  large  or  bulky  for  the 
SIHS  TD  purpose.  In  addition,  many  of  the  fusion  boards  possessed  proprietary  or  single  source 
fusion algorithms. A total of seven fusion boards were identified as candidates; however, only two
met the desired characteristics for SIHS TD.

Equinox  Corporation  has  developed  a  line  of  image  fusion  products.  The  concept  is  a  single 
unified  video  image  fusion  device  that  can  centrally  interface  with  a  variety  of  input  cameras  and 
output  displays,  together  with  a  suite  of  algorithms  that  support  image  fusion  across  different 
combinations  in  the  spectrum.  These  devices  are  small  in  size,  lightweight  and  have  relatively  low 
power consumption. The key issues for practical field usage are how to effectively visualize two
complementary modalities at video rates with sufficiently low power consumption and a small
form factor (Wolff, Socolinsky, and Eveland, 2006).

Figure  18  shows  Equinox’s  DVP-4000  hardware  for  image  fusion  of  two  inputs.  They  have 
implemented  a  visually  intuitive  computational  image  fusion  algorithm  with  ancillary 
computational  features  such  as  non-linear  image  modality  co-registration  and  automatic  gain 
control  (AGC)  onto  a  compact  board. 


Figure 18: Equinox’s DVP-4000 dual video processing board (Wolff et al., 2006). The board takes
two inputs (Imaging Modality #1 and Imaging Modality #2) and outputs fused video.

Similar to Equinox’s DVP-4000 is Imagize’s FP-3500. It is Imagize’s smallest board and has lower
power requirements than Equinox’s model. The FP-3500 is able to fuse input images of different
sizes and produce a high resolution (1600 x 1200) output image. The Equinox Corporation and
Imagize seem to dominate the real-time image fusion processor industry. The Equinox models use
an open architecture for the input of algorithms developed by Waterfall Solutions (Surrey, England).

The Imagize model uses a closed-system approach with algorithms based on biological vision
systems, but Imagize does not disclose which algorithms. Octec Image Processing produces a
video tracker that is capable of fusing videos and contains multiple analog video and digital video
outputs, as well as multiple analog outputs, with the ability to integrate multiple algorithms. For
desktop and open source applications the ADEPT60 from Octec appears to be the primary image
processor used in the literature (see Figure 19). There are several other companies that develop
frame grabbers that are able to select certain frames from the input videos, which are then passed on
to the fusion process.


Figure 19: Octec’s ADEPT60 automatic video tracker/image processor

4.1.3.2 Purpose-Built Fusion System

If  the  capabilities  of  COTS  fusion  systems  cannot  meet  the  needs  of  the  SIHS  Vision  SST,  then  a 
purpose-built  system  could  be  constructed.  The  system  developed  by  Fay  et  al.  (2000)  for  their 
colour  fusion  study  utilized  two  Matrox  Corp.  Genesis  main  boards  and  two  Genesis  co-processor 
boards,  in  an  industrial  PC  rack-mount  chassis,  with  a  Pentium  II  host  processor  card.  A  system 
developed  today  could  utilize  a  significantly  faster  processor. 

If the power of a Matrox Genesis system (up to 100 billion operations per second) is not required,
then another approach would center on a powerful processor and COTS frame grabbers. An EBX,
PC/104 or PC/104-Plus minimodule computer format is not believed to be required, because the
Vision SST fusion test bed is primarily for video and fusion image collection and will not be
configured into a man-portable system.

4.1.3.2.1  Frame  Grabbers 

A frame grabber is a board that plugs into a computer to capture an analog/digital signal and
digitize it so that a single frame or multiple frames can be extracted. It is a critical piece
of hardware when select frames of two separate analog/digital input signals are to be fused. Once the
frame grabber digitizes the signals and the frames are selected, they are passed to the fusion
processor, where they undergo the fusion process before being sent to the display unit. There are many
different manufacturers of frame grabbers and only a select few companies are presented here. The
most prevalent companies include Matrox, Sensoray, Alacron, Matrix, PixelSmart, Epix and
BitFlow. Frame grabbers are used to digitize analog/digital input signals, and the application
determines whether or not one is necessary. If the application requires that certain
frames of the input signals be fused and the resulting image evaluated, then a frame grabber is
necessary. However, if the requirement is only to monitor continuous real-time video of the fused
input videos, then a frame grabber is not necessary. Once it is determined that a frame grabber is
necessary for a certain application, the available frame grabbers need to be evaluated so that
the appropriate one can be selected.

Table 5 describes a number of frame grabbers manufactured by these companies. By no means
does this cover the full array of frame grabbers available in the marketplace today. It is important to
note that there are separate frame grabber models for different types of cameras, and
careful consideration is needed when selecting the appropriate frame grabber.


Table 5: Frame grabber results

| Model | Outputs | Inputs | Resolution | Acquisition Rate |
|---|---|---|---|---|
| PixelSmart 512-8 | Composite RGB | Multiple NTSC, PAL, LVDS, RS422 | 640x480, 512x480 | NA |
| Sensoray 512 | PAL or NTSC | Multiple 2 video or 4 composite | 640x480 NTSC, 768x576 PAL | 25 - 30 frames/s |
| Sensoray 516 | PAL or NTSC | Multiple 2 video or 4 composite | Input 704x480 NTSC / 704x576 PAL; Output 768x576 PAL / 704x480 NTSC | 25 - 30 frames/s |
| Alacron FFRAME-CB | Colour | 1 Digital, 1 Analog | NA | 27 MHz |
| Alacron FAST-X | Not Colour | 6 Digital Camera Links | NA | |
| Alacron FAST-UXGA | Not Colour | 4 UXGA, Four Analog Channels | NA | 205 MHz |
| EPIX PIXCI-D | NO | LVDS/RS422 | 1K x 1K | NA |
| MATROX Titlemotion | VGA | NTSC-PAL | NTSC Full Frame | NA |
| MATROX Meteor-II | Standard and Non-Standard analog Monochrome or component RGB | NA | NA | Up to 30 MHz |
| MATROX Helios eA/XA | Standard and Non-Standard analog Monochrome or component RGB | NA | NA | Up to 160 MHz |
| MATROX Vio | HD (720p or 1080i) or SD | Analog including component RGB, Optional SDI | NA | NA; CCIR-601 for HD, SD |

4.2  Fusion  Algorithms 

A literature search was conducted based on the search parameters given in Table 1. Based on the
results of the literature search, the following algorithms were identified for their use in image
fusion: Principal Component Analysis, Discrete Wavelet Transform, Wavelet Transforms,
Shift-invariant Discrete Wavelet Transforms, Laplacian Pyramid, Simple and Weighted Average,
Gradient Pyramid, Contrast Pyramid, Morphological Pyramid, Ratio of Low-Pass Pyramids,
Intensity-Hue-Saturation, Advanced Discrete Wavelet Transform, Edge Detection, Brovey
Transform, Filter Subtract Decimate, Hermite Transform, Principal Component Analysis with
Wavelet Transform, Finite Ridgelet Transform, Contourlet Transform, Dynamic Contour, Sarnoff’s
Feature Level, and Decision Level. These algorithms were identified through the search of
approximately 40 articles and by no means include all of the available algorithms used for image
fusion, but they do include the most prevalent algorithms in the literature.

The list of algorithms identified through the literature was down-selected based on times cited,
availability of information, and applicability to night vision image fusion. The selected
algorithms used for this report include Principal Components Analysis, Simple Averaging,
Laplacian Pyramids, Morphological Pyramids, Gradient Pyramids, Ratio of Low-Pass Pyramids,
Wavelet Transforms including the Discrete Wavelet Transform and the Shift-invariant Wavelet
Transform, and Edge Detection.

4.2.1 Fusion Algorithms Background

In its simplest form, an algorithm is a procedure for accomplishing a task in which a given initial state
goes through a set of procedures and terminates in a pre-defined end state. In the case of image
fusion, the algorithm defines which processes the initial images go through in order to end with a
fused image incorporating all of the necessary information present in the initial images.

Over the years numerous image fusion algorithms have been developed to address the growing
need for image fusion. The algorithms can be roughly divided into two groups:
multiscale-decomposition (MSD)-based fusion methods and non-multiscale-decomposition (NMSD)-based
fusion methods (Blum & Liu, 2006). The basic idea of an MSD-based fusion method is that a
multiscale transform is performed on the source images, and then a composite multiscale
representation of these images is constructed based on a predetermined selection rule. The fused
image is obtained by taking the inverse of the original multiscale transform (Blum & Liu, 2006).
The most common MSD methods include pyramid transforms and wavelet transforms (WT).
NMSD methods, by definition, are not based on multiscale transforms. The most common NMSD
fusion methods include Principal Component Analysis (PCA), the Weighted Average technique,
Estimation Theory methods, and Artificial Neural Networks. Image fusion techniques can also be
classified based on the level of processing where the fusion takes place. There are three main levels
where image fusion may take place:

•  Pixel  Level; 

•  Feature  Level;  and 

•  Decision  Level. 

For the purposes of this report the fusion algorithms are classified by the level at which
the fusion processing takes place. Under the classification of Pixel Level fusion, the
following algorithms are discussed in more detail: the Simple Averaging technique, PCA, pyramid
based fusion schemes, and wavelet transforms. Under the classification of Feature Level fusion, the
edge detection algorithm is discussed. Decision Level fusion is described briefly, but no
specific algorithms are included owing to the lack of literature on the use of Decision Level
fusion algorithms for image fusion.

4.2.2  Pixel  Level  Image  Fusion 

Image fusion at the pixel level means fusion at the lowest processing level, referring to the merging
of the physical parameters of the source images (Pohl & Van Genderen, 1998). Among the three
fusion levels, pixel level fusion is the most mature and encompasses the majority of image fusion
algorithms in the literature today. Figure 20 illustrates a schematic of the pixel level fusion process.

Figure 20: Schematic of Image Level Fusion (from Pohl & Van Genderen, 1998)

All  input  images  are  aligned  first  and  then  the  algorithm  is  performed  across  the  pixels  of  all  the 
input  images.  Therefore,  to  perform  pixel  level  fusion  all  input  images  need  to  be  spatially 
registered  exactly  to  all  other  input  images,  so  that  all  pixel  positions  of  all  the  input  images 
correspond  to  the  same  location  in  the  real  world  (Rockinger,  1996).  There  can  be  some  generic 
requirements  imposed  on  the  fusion  result  from  pixel  level  fusion: 

•  The  fusion  process  should  preserve  all  relevant  information  on  the  input  imagery  in  the 
composite  image  (pattern  conservation); 

•  The  fusion  scheme  should  not  introduce  any  artefacts  or  inconsistencies  which  would 
distract  the  human  observer  or  following  processing  stages;  and 

•  The  fusion  scheme  should  be  shift  and  rotational  invariant,  i.e.  the  fusion  result  should 
not  depend  on  the  location  or  orientation  of  an  object  in  the  input  imagery.  (Rockinger, 
1996) 


The remainder of this section focuses on the most common pixel level fusion algorithms. It
begins with a simple averaging technique, followed by principal components analysis, pyramid
fusion schemes (Laplacian, Morphological, Gradient, and Contrast), and wavelet transforms
(Discrete Wavelet Transform and Shift Invariant Discrete Wavelet Transform).

4.2.2.1 Simple Averaging Technique

Averaging is the simplest of the image fusion techniques. It works by taking the average intensity
value of the input images pixel by pixel (Li, Manjunath, and Mitra 1995). The averaging technique
also allows the weight that each input image has on the resulting fused image to be varied. Instead
of each input image contributing the same amount to the fused image, one input image can
contribute more, based on a pre-selected rule. For instance, when fusing thermal and I2 sensors,
larger weights may be assigned to the warmer or cooler pixels of the thermal image, or to those
pixels whose intensities differ greatly from those of their neighbours (Blum & Liu, 2006). A
disadvantage of the averaging technique is that if an object appears with a certain contrast in one
sensor and with the opposite contrast in the other sensor, the fusion process will effectively
cancel out the object in the fused image (Fechner & Godlewski, 1995). No matter how the
weighting coefficients are determined, pixels from input images with high contrast values will be
depressed in the composite fused image. This is detrimental if the object of interest has a high
contrast value in one of the input images. Even though this can have negative effects on
target detection and recognition, averaging is the simplest fusion scheme and is typically used as a
benchmark for all other fusion schemes (Lanir, 2005).
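
The averaging rule and its contrast-cancellation weakness can be illustrated with a minimal sketch. The function name `weighted_average_fusion`, the 3 x 3 test patches, and the weights 0.8 and 0.2 are assumptions chosen for illustration only; they are not taken from the cited studies, and the inputs are assumed to be already co-registered.

```python
import numpy as np

def weighted_average_fusion(images, weights=None):
    """Fuse co-registered images by a per-pixel weighted average.

    images  : sequence of 2-D arrays with identical shape
    weights : one scalar per image (global weight) or one array per image
              (per-pixel weights); equal weights are used when omitted
    """
    stack = np.stack([np.asarray(im, dtype=float) for im in images])
    if weights is None:
        w = np.ones_like(stack)
    else:
        w = np.stack([np.broadcast_to(np.asarray(wi, dtype=float), stack.shape[1:])
                      for wi in weights])
    # Normalise so that the weights at every pixel sum to one.
    w = w / w.sum(axis=0, keepdims=True)
    return (w * stack).sum(axis=0)

# Hypothetical patches: the same target is bright in one sensor and dark in the other.
sensor_a = np.full((3, 3), 0.5)
sensor_a[1, 1] = 0.9            # target brighter than background
sensor_b = np.full((3, 3), 0.5)
sensor_b[1, 1] = 0.1            # target darker than background (opposite contrast)

fused_equal = weighted_average_fusion([sensor_a, sensor_b])
fused_weighted = weighted_average_fusion([sensor_a, sensor_b], weights=[0.8, 0.2])
```

With equal weights the target pixel averages to the background value and disappears from the fused patch. With unequal weights it survives, but with less contrast than in the stronger input, which is the depression effect described above.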


4.2.2.2 Principal Components Analysis (PCA)

As opposed to the previous method, where the weighting coefficients of each input image are
pre-selected, optimal weighting coefficients, with respect to the information content of the input
images and the ability to remove redundancy between them, can be determined by a principal
components analysis (PCA). PCA is a method of finding patterns in data of high dimensions and
compressing the data into a more manageable form by reducing the number of dimensions without
much loss of information (Smith, 2002). The rest of this section provides a brief mathematical
description of PCA, followed by its application to image fusion, the advantages and disadvantages
of PCA for image fusion, and previous studies that have measured the effectiveness and quality of
fused images using PCA.


The mathematical explanation of PCA given here does not go into great detail on the derivation and
formulation of the method. For example, given a data set of two variables, the first step is to
subtract the mean of each variable from all of the data points of that variable, which leaves
an adjusted data set. The next step is to calculate the covariances of the variables and place
the values into a covariance matrix. A data set of two variables gives a 2 x 2 covariance matrix,
and a data set of three variables gives a 3 x 3 matrix. The next step is to
calculate the eigenvectors and eigenvalues of the covariance matrix. Without going into detail
about eigenvectors and eigenvalues, the eigenvectors form a matrix with the same dimensions as the
covariance matrix. The corresponding eigenvalues are listed in one column, with each row
giving the eigenvalue for the corresponding column of the eigenvector matrix. For example,
the eigenvalue of 1.28402771 is the eigenvalue associated with the 2nd column of the
eigenvector matrix.


\[
\text{eigenvalues} =
\begin{pmatrix} 0.0490833989 \\ 1.28402771 \end{pmatrix},
\qquad
\text{eigenvectors} =
\begin{pmatrix} -0.735178656 & -0.677873399 \\ 0.677873399 & -0.735178656 \end{pmatrix}
\]

The eigenvectors represent information about the patterns within the given variables. The eigenvector
that is associated with the highest eigenvalue represents the direction in the data that provides the
strongest pattern or relationship amongst the original data set. The original data set can then be
compressed by choosing the eigenvector with the highest eigenvalue, which is known as the
principal component. A data set of 20 variables could, for example, be compressed to 15
variables by keeping the eigenvectors with the 15 highest eigenvalues. To obtain the final data set,
the transpose of the selected eigenvectors is multiplied by the transpose of the mean-adjusted data
set. This gives the original data in terms of vectors that describe the patterns within the data set.
The whole process of PCA is to transform the data so it can be expressed in terms of patterns and to
decompose the data into vectors that describe the greatest contribution to the patterns within the
data set.
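
The steps described above (mean subtraction, covariance matrix, eigen-decomposition, projection) can be sketched with NumPy as follows. This is a minimal illustration and not the implementation used in any of the cited studies. The function name `pca` and the synthetic two-variable data set are assumptions introduced here, so the resulting eigenvalues differ from the example values quoted above.

```python
import numpy as np

def pca(data, n_components):
    """PCA by eigen-decomposition of the covariance matrix.

    data : array of shape (n_samples, n_variables)
    Returns eigenvalues (descending), eigenvectors (as columns, same order)
    and the data projected onto the first n_components eigenvectors.
    """
    data = np.asarray(data, dtype=float)
    adjusted = data - data.mean(axis=0)          # step 1: subtract each variable's mean
    cov = np.cov(adjusted, rowvar=False)         # step 2: covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)       # step 3: eigen-decomposition (ascending order)
    order = np.argsort(eigvals)[::-1]            # re-order by decreasing eigenvalue
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    feature = eigvecs[:, :n_components]          # keep the strongest components
    projected = (feature.T @ adjusted.T).T       # step 4: transposed eigenvectors x transposed data
    return eigvals, eigvecs, projected

# Synthetic, strongly correlated two-variable data set (illustrative values only).
rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = 0.9 * x + 0.1 * rng.normal(size=200)
data = np.column_stack([x, y])

eigvals, eigvecs, projected = pca(data, n_components=1)
explained = eigvals[0] / eigvals.sum()           # share of total variance kept by one component
```

Because the two synthetic variables are strongly correlated, a single component retains most of the variance, which is the property that PCA-based fusion relies on.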

This relates to image fusion because each image can be seen as a variable in a data set. For two
images of N x M pixels, the data set is an NM x 2 matrix in which each row
contains the intensity levels of the same pixel in the two images. A PCA is then
performed on this data set and the eigenvector with the highest eigenvalue is selected in order to
compress the data into a single dimension. Instead of subtracting the mean of each input image
from the image, each input image is filtered and the filtered images are subtracted from the original
images (Chari, Fanning, Salem, Robinson, and Halford 2005). The first eigenvector usually contains
more than 90% of the information present in all of the original images (Senthil & Muttan, 2006).
The fused picture will be of lower quality than any of the originals because only the eigenvector
with the highest eigenvalue is selected and therefore some of the patterns between the original
images are lost (Smith, 2002). In order for PCA
to be used effectively there needs to be a strong correlation between the original image data and the
fused image data, and sometimes this is not the case (Huihui, Lei, and Hang, 2005).
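
One common way of turning the principal eigenvector into a fused image is to normalise its two components so that they sum to one and use them as weighting coefficients. The sketch below takes that route as an assumption. It uses plain mean subtraction (performed inside `np.cov`) rather than the filtered-image subtraction of Chari et al. (2005), and the function name `pca_fusion` and the synthetic test images are hypothetical.

```python
import numpy as np

def pca_fusion(img_a, img_b):
    """Fuse two co-registered images using weights from the principal eigenvector."""
    a = np.asarray(img_a, dtype=float)
    b = np.asarray(img_b, dtype=float)
    # Each pixel is one observation and each image one variable: an (N*M, 2) data matrix.
    data = np.column_stack([a.ravel(), b.ravel()])
    cov = np.cov(data, rowvar=False)               # 2 x 2 covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)
    principal = eigvecs[:, np.argmax(eigvals)]     # eigenvector with the highest eigenvalue
    # Normalising by the sum removes the arbitrary sign of the eigenvector.
    # Caveat: for negatively correlated images the sum can approach zero.
    weights = principal / principal.sum()
    fused = weights[0] * a + weights[1] * b
    return fused, weights

# Synthetic example: the same scene seen with high and low contrast plus sensor noise.
rng = np.random.default_rng(1)
scene = rng.random((32, 32))
img_a = scene + 0.05 * rng.normal(size=scene.shape)
img_b = 0.3 * scene + 0.05 * rng.normal(size=scene.shape)

fused, weights = pca_fusion(img_a, img_b)
```

In this example the higher-contrast image receives the larger weight, so the weights are derived from the data rather than pre-selected as in the averaging technique.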

The main advantage of PCA is that it accepts a large number of inputs, and most of
the information within all the inputs can be compressed into a much smaller number of outputs
without much loss of information (Senthil & Muttan, 2006). One disadvantage of the use of PCA
for image fusion is that only the first eigenvector is selected to describe the data set. Even
though this eigenvector contains 90% of the shared information, there is still some information that
will not be evident in the final fused image.

PCA performs well when compared to other image fusion algorithms. In a study by Tsagaris and
Anastassopoulos (2006), PCA was superior to the simple averaging technique and the
Morphological pyramid algorithm for the majority of the measures, and was inferior only to the
discrete wavelet transform. In a study that measured detection of various targets, PCA was superior
to the simple averaging and edge detection methods and also performed favourably when target
detection time was taken into consideration.

4.2.3  Pyramid  Based  Fusion  Schemes 

All pyramid based fusion schemes follow the same basic process. An image pyramid can be
described as a sequence of images where each image is constructed by low-pass or band-pass filtering
the previous image and reducing its sample density (Zheng, Essock, and Hansen, 2005). Typically,
the reduction in sample density is by a factor of 2, so that each successive image representation is
halved in both spatial dimensions (Rockinger, 1998). The fused image is derived by using a
pre-determined selection rule for each level of the pyramid. Once a fused pyramid representation is
developed, the inverse of the pyramid transform produces the fused image (Zheng et al.
2005). The method used to filter the original images and reconstruct the fused image from the
pyramid levels defines the specific type of pyramid based fusion scheme. Figure 21 illustrates
the successive levels of a pyramid based fusion method for a single input image. At each level of the
pyramid the image is down-sampled and specific frequencies are filtered out from the previous
image.

4.2.3.1 Laplacian Pyramid Algorithm (LAP)

The  Laplacian  Pyramid  Algorithm  (LAP)  is  the  most  frequently  studied  version  of  the  pyramid 
transform  (Blum  &  Liu,  2006).  Each  level  of  the  LAP  is  constructed  from  its  lower  level  by  four 
basic  procedures: 

■  Blurring  (low-pass  filtering); 

■  Sub-sampling  (reduce  size); 

■  Interpolation  (expand  in  size);  and 

■  Differencing  (subtract  two  images  pixel  by  pixel  (Blum,  2006)). 

The down-sampling of the image is by a factor of 2, which means keeping one sample out of every
2 in both the horizontal and vertical directions (Blum & Liu, 2006). This down-sampling is
possible because the low-pass filtering reduces the spatial frequency content of the image (Toet,
1989). The up-sampling procedure inserts a zero between neighbouring samples in both the
horizontal and vertical directions. After the differencing procedure, the resulting image is a
band-pass filtered copy of its predecessor (Sadjadi, 2005).
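
The four procedures can be written compactly as a recursion. The notation below is introduced here for illustration and is not taken from the cited papers: $I$ is the input image, $w$ a low-pass kernel, $*$ convolution, $\downarrow_2$ and $\uparrow_2$ down-sampling and zero-insertion up-sampling by 2 in both directions, $G_k$ the low-pass (Gaussian) levels, and $L_k$ the band-pass (Laplacian) levels of an $N$-level pyramid.

```latex
\begin{align}
G_0     &= I, \\
G_{k+1} &= \left( w * G_k \right)\!\downarrow_2,
          && k = 0, \dots, N-1 \quad \text{(blurring and sub-sampling)} \\
L_k     &= G_k - 4\, w * \left( G_{k+1}\!\uparrow_2 \right),
          && k = 0, \dots, N-1 \quad \text{(interpolation and differencing)} \\
L_N     &= G_N .
\end{align}
```

The factor of 4 compensates for the zeros inserted during up-sampling in two dimensions. Rearranging the third line gives $G_k = L_k + 4\, w * ( G_{k+1}\!\uparrow_2 )$, so the original image can be recovered exactly by working back down from $L_N$.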

Once the pyramids are constructed for each input image, a selection method is used to decide
which source contributes which pixels at each level of the pyramid (Sadjadi, 2005). A
common method is to select the source with the highest-contrast feature for inclusion in
the fused image (Bender, Reese, and van der Wal, 2003). The inverse pyramid transform merges all
the collected features from the input images into a single coherent image (Bender et al. 2003).
Because this rule selects the input image with the highest contrast, the method generally yields
higher image contrast (Bender et al. 2003).
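
A minimal, self-contained sketch of Laplacian pyramid fusion is given below. Several choices are assumptions made for illustration and are not taken from the cited papers: the 5-tap binomial kernel, the reflected image borders, the use of the largest absolute band-pass coefficient as a stand-in for "highest contrast", and the averaging of the coarsest low-pass level. All function names are hypothetical.

```python
import numpy as np

KERNEL = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0   # 5-tap binomial low-pass filter

def blur(img):
    """Separable low-pass filtering with reflected borders."""
    h, w = img.shape
    padded = np.pad(img, 2, mode="reflect")
    rows = sum(KERNEL[i] * padded[:, i:i + w] for i in range(5))   # filter along columns
    return sum(KERNEL[i] * rows[i:i + h, :] for i in range(5))     # filter along rows

def reduce_(img):
    """Blur, then keep every second sample in both directions."""
    return blur(img)[::2, ::2]

def expand(img, shape):
    """Insert zeros between samples, then interpolate with the same filter."""
    up = np.zeros(shape)
    up[::2, ::2] = img
    return 4.0 * blur(up)        # factor 4 compensates for the inserted zeros

def laplacian_pyramid(img, levels):
    pyramid, current = [], np.asarray(img, dtype=float)
    for _ in range(levels):
        smaller = reduce_(current)
        pyramid.append(current - expand(smaller, current.shape))   # band-pass level
        current = smaller
    pyramid.append(current)      # coarsest low-pass level
    return pyramid

def reconstruct(pyramid):
    """Inverse pyramid transform: expand and add, from coarse to fine."""
    current = pyramid[-1]
    for band in reversed(pyramid[:-1]):
        current = band + expand(current, band.shape)
    return current

def fuse(img_a, img_b, levels=3):
    pa, pb = laplacian_pyramid(img_a, levels), laplacian_pyramid(img_b, levels)
    # Selection rule: at each band-pass pixel keep the coefficient of larger magnitude.
    fused = [np.where(np.abs(a) >= np.abs(b), a, b) for a, b in zip(pa[:-1], pb[:-1])]
    fused.append(0.5 * (pa[-1] + pb[-1]))    # average the coarsest level
    return reconstruct(fused)

# Illustrative random inputs; real use requires spatially registered sensor images.
rng = np.random.default_rng(2)
img_a = rng.random((32, 32))
img_b = rng.random((32, 32))
fused = fuse(img_a, img_b)
```

With three levels, a 32 x 32 input is decomposed into band-pass levels of 32, 16 and 8 pixels per side plus a 4 x 4 low-pass level. Reconstructing an unmodified pyramid returns the input image, so any difference in the fused result comes from the selection rule alone.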

LAP performs favourably when compared to other fusion methods, including other pyramid
methods. With respect to entropy, image quality index, and spatial frequency, LAP ranked higher
than 4 other pyramid based methods when fusing night time imagery where brightness and contrast
were very different between input images (Zheng et al. 2005). Using the same metrics and with
input images of similar brightness and contrast, the LAP finished 2nd amongst the pyramid based
fusion methods behind the Contrast method, which will be discussed later (Zheng et al. 2005). A
group of human observers found that the LAP method, together with the Shift Invariant Discrete
Wavelet Transform, was the preferred method of fusion when examining night vision images
(Chen & Blum, 2005). This study looked at 17 different fusion algorithms and, in addition to the
subjective tests, the LAP finished in the top 3 in 4 of the 7 objective tests (Chen & Blum, 2005).

A disadvantage of the LAP is that it only decomposes images by a factor of 2, which
places a certain amount of restriction on the composition of the fused image (Jishuang &
Chao, 2001). Increasing the decomposition levels, by methods not restricted to the factor of 2, will
drastically increase the computational demand but improve the quality of the fused image (Jishuang
& Chao, 2001). Image fusion methods based on local contrast decomposition do not distinguish
between material edges and temperature edges, which could introduce an abundance of irrelevant
information even though all the details in the scene are enhanced (Toet & Franken,
2003). The addition of irrelevant information may clutter the scene and lead to misinterpretation of
perceived detail (Toet & Franken, 2003).

4.2.3.2 Morphological Pyramid Algorithm (MORPH)

Normal filtering techniques, as in the LAP, usually alter the details of shape and the exact
location of the objects in the image (Sadjadi, 2005). Morphological pyramid algorithms (MORPH)
address this issue by removing image details without any negative effects and without adding any
gray scale bias (Toet, 1989). The main difference between the MORPH and the LAP is the use of
morphological pyramids in place of Laplacian pyramids: instead of simple low-pass and band-pass
filters, the MORPH uses a filtering method that relies on shape definition and extraction
(Sadjadi, 2005).

Morphological  filters  are  sequences  of  morphological  operations  that  have  special  properties  with 
respect  to  shapes  in  the  image  (Toet,  1989).  They  can  be  used  to  ‘clean-up’  gray  scale  images  by 
choosing  a  structuring  element  that  is  larger  than  the  unwanted  details  in  the  image  (Ramac,  Uner, 
and  Varshney,  1998).  Morphological  filters  use  what  is  called  opening  and  closing  transformations. 
These  transformations  are  dual  operations,  in  that  what  one  does  to  the  image  foreground  the  other 
does  to  the  image  background  (Toet,  1989).  Sequentially  alternating  these  transformations  means 
that  the  background  and  foreground  are  treated  in  the  same  way  (Toet,  1989).  Morphological  filters 
can  also  extract  objects  of  a  certain  size  range  from  an  image  (Ramac  et  al.  1998). 

Once  filtered,  a  morphological  pyramid  can  be  formed  by  a  specific  sampling  method.  This  process 
is  similar  to  the  LAP.  Once  the  pyramids  are  formed  a  selection  rule  is  applied  and  a  composite 
image  is  formed.  The  fused  image  is  developed  by  the  inverse  pyramid  transform. 
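
The opening and closing transformations, and the filter-then-sample step that builds one pyramid level, can be sketched on a 1-D gray scale profile. This is an illustrative sketch only: the flat 3-sample structuring element, the truncation of the window at the borders, and the choice of keeping every second sample are assumptions, and the function names are hypothetical.

```python
def erode(signal, size=3):
    """Gray scale erosion: minimum over a flat window (truncated at the borders)."""
    r, n = size // 2, len(signal)
    return [min(signal[max(0, i - r):min(n, i + r + 1)]) for i in range(n)]


def dilate(signal, size=3):
    """Gray scale dilation: maximum over the same flat window."""
    r, n = size // 2, len(signal)
    return [max(signal[max(0, i - r):min(n, i + r + 1)]) for i in range(n)]


def opening(signal, size=3):
    """Erosion then dilation: removes bright details narrower than the window."""
    return dilate(erode(signal, size), size)


def closing(signal, size=3):
    """Dilation then erosion: removes dark details narrower than the window."""
    return erode(dilate(signal, size), size)


def next_pyramid_level(signal, size=3):
    """Open-close filter, then keep every second sample (assumed sampling scheme)."""
    return closing(opening(signal, size), size)[::2]


bright_spike = [5, 5, 5, 9, 5, 5, 5]
dark_dip = [5, 5, 5, 1, 5, 5, 5]
print(opening(bright_spike))   # [5, 5, 5, 5, 5, 5, 5]  (spike removed)
print(closing(dark_dip))       # [5, 5, 5, 5, 5, 5, 5]  (dip removed)
```

The duality noted in the text shows up directly: opening leaves the dark dip untouched and closing leaves the bright spike untouched, so alternating the two treats foreground and background alike.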

The non-linear filtering method used in MORPH was thought to improve on the performance of
linear filters (e.g. LAP). Based on the work by Zheng et al. (2005), however, the MORPH method
does not perform as well as the LAP method in terms of the objective tests performed (Entropy,
Image Quality Index, and Spatial Frequency) for both the night vision images and the day time
images. These results were verified in the work by Chen and Blum (2005); however, results based
on image entropy showed no correlation with subjective scores from human observers. A reason that
the MORPH method may score high on such objective measures is that they are sensitive to
increases in contrast levels and to noise and other dramatic fluctuations in the image that may
be caused by artefacts and algorithm-created spots (Chen & Blum, 2005). The MORPH method has also
proved inferior in other objective fusion tests when compared to non-pyramid-based fusion
methods. With respect to the Image Fusion Performance Measure (IFPM), the MORPH method performed
poorly when compared to the PCA method, the discrete wavelet transform, and the simple averaging
technique (Tsagaris and Anastassopoulos, 2006).

4.2.3.3 Gradient Pyramid Algorithm (GRAD)

The Gradient Pyramid Algorithm (GRAD) is similar to the other pyramid methods. The difference
between this and the LAP method is that a gradient operator is applied to every level of the
pyramid, producing horizontal, vertical, and 2 diagonal pyramid sets for each pyramid level (see
Figure 22), as compared to just horizontal and vertical pyramid sets in the LAP method (Zheng,
Essock, and Hansen 2004). The rest of the GRAD follows similar steps to the other pyramid-based
methods. Once the pyramid sets are developed, selection and match criteria are applied to the
source images; based on the results of the selection and match process a composite image is
obtained, and the inverse pyramid transform produces the fused image (Wang, Zhang, Wang, and
Wang, 2005). However, because there are more pyramid sets associated with the GRAD, several
intermediate steps are needed to prepare the composite image for fusion.


Figure  22:  Four  Levels  of  the  GRAD  Pyramid  Set  (from  Sims  &  Phillips,  1997) 

Before  the  inverse  pyramid  transform  is  performed  to  compose  the  final  fused  image  the  4  gradient 
pyramids  need  to  be  converted  into  Laplacian  pyramids.  This  occurs  over  several  steps.  The  first 
step  is  to  convert  each  gradient  pyramid  level  to  a  corresponding  second  derivative  pyramid.  These 
pyramids  are  then  summed  to  form  a  filter  subtract  decimate  (FSD)  Laplacian  pyramid.  The  FSD 
Laplacian  pyramid  can  then  be  converted  to  a  composite  Laplacian  pyramid  which  then  can 
compose  the  fused  image  through  an  inverse  pyramid  transform  (Sims  &  Phillips,  1997). 

One advantage that the GRAD has over the LAP is improved temporal stability, indicated by a
reduced conditional entropy measure (Rockinger, 1998). The improved temporal stability comes at a
cost of visual clarity when identifying targets (Sims & Phillips, 1997) and of sharpness, when
compared to the LAP (Miao & Wang, 2006). The GRAD also has the advantage of transferring a greater
amount of salient information from the input image to the fused image when compared to the MORPH
method, but not as much as the LAP (Chen & Blum, 2005).


4.2.3.4 Ratio of Low-Pass Pyramid Algorithm (RoLP)

The Ratio of Low-Pass pyramid algorithm (RoLP) judges the relative importance of pattern
segments based on their local luminance contrast values (Toet, 1989). All input images are
decomposed into light and dark blobs on decreasing levels of resolution (pyramids). The image at
each level of the pyramid is essentially the ratio of two successive levels of the Gaussian
pyramid (Sims & Phillips, 1997). The composite image is formed by selecting, at each pixel
location and pyramid level, the input-image value with the largest contrast deviation (Sadjadi,
2005). The fused image is formed by the same expand and add procedure used in the LAP (Sims &
Phillips, 1997). Originally, the RoLP was explicitly intended for use by human observers (Zheng
et al. 2005). Toet (1989) claims that the LAP is not a faithful representation of the human
visual system because it only accounts for absolute luminance differences, whereas the RoLP
encodes local luminance contrasts.

In objective performance measures the RoLP performs similarly to the LAP with respect to Image
Entropy and Spatial Frequency (Zheng et al. 2004). Both methods also have almost identical Image
Quality Indices, a measure for images without a ground truth reference (Zheng et al. 2004).
However, a similar study performed a year later by the same group found that the RoLP was
inferior to the LAP, GRAD, and MORPH methods. This underlines the significant concern over the
validity of the objective measures used to assess the quality of image fusion. The RoLP does
receive favourable results with respect to image entropy but is drastically inferior to the LAP
and GRAD with respect to the sharpness of the image (Miao & Wang, 2006). The RoLP is known to
produce algorithm-created spots in the fused image that will affect the quality of the image; see
Figure 23 (Chen & Blum, 2005). The RoLP may receive favourable results from the entropy measure
because entropy is sensitive to dramatic fluctuations in the image yet cannot discern whether
those fluctuations come from noise or from useful information (Chen & Blum, 2005). As can be seen
from Figure 23, the RoLP may produce unwanted noise in the fused image.


Figure  23:  Fused  Image  from  RoLP  Method 


4.2.4  Wavelet  Transforms  (WT) 

Wavelet transforms are very similar to pyramid-based methods in that the transformed
(decomposed) images are combined in the transform domain using a defined fusion rule. The
composite image is then transformed back into the spatial domain to give the resulting fused image
(Hill, Canagarajah, and Bull, 2002). The basic idea of the wavelet transform is to represent an
image as a superposition of wavelets, with each wavelet having an assigned wavelet transform
value. This is very similar to the Fourier transform in signal processing, where any signal can be
broken down into a series of sine waves of different frequencies, with each frequency having an
assigned power (contribution) to the overall signal. When the wavelet transforms of the images are
computed they contain the low-high, high-low and high-high frequency bands of the image at the
different levels, while the low-low band of the image is retained only at the coarsest level. The
low-low band transform values are all positive, while the other bands contain values that
fluctuate around zero. The larger absolute transform values correspond to sharper brightness
changes that may identify edges, lines, and regional boundaries (Li et al. 1995). Similar to the
pyramid-based methods, a selection rule is put in place to select the transform values at each
pixel location, and the inverse wavelet transform produces the fused image from the combined
transform coefficients (Li et al. 1995). The transform coefficients are the result of the low-pass
and high-pass filtering and the down-sampling that the image goes through. The low-pass filter
produces approximation coefficients while the high-pass filter produces detail coefficients; see
Figure 24 for a schematic of the basic fusion scheme of the wavelet transform.


Figure 24: Schematic for the Basic Wavelet Transform Fusion Scheme

The two most common versions of the wavelet transform are the Discrete Wavelet Transform
(DWT), which yields a shift-variant signal representation, and the Shift-invariant Discrete
Wavelet Transform (SiDWT), which overcomes the shift dependency of the signal representation
(Piella & Heijmans, 2002). Wavelet transforms have a number of advantages over pyramid-based
methods. While pyramid-based methods produce an over-complete signal representation, the wavelet
transform results in a non-redundant signal representation (Rockinger & Fechner, 1998). This means
that different levels of a pyramid may contain the same information, while at each level of the
wavelet transform the information is unique.

The  wavelet  transform  has  certain  advantages  over  pyramid-based  fusion  algorithms: 

■ The size of the wavelet transform is the same as the image, even when the image
height and width are not powers of 2. The Laplacian pyramid is 4/3 the size of the
image, so the wavelet transform is more compact;

■ The wavelet transform provides directional information in the high-low, low-high,
and high-high frequency bands, while the pyramid-based techniques fail to
introduce any spatial orientation selectivity into the decomposition process;

■ The information contained at different resolutions is unique, while the pyramid
decomposition contains redundancy between different scales; and

■ In the Laplacian pyramid-based fusion, the fused images often contain blocking
effects in the regions where the multi-sensor data are significantly different. This
can be attributed to the instability in the reconstruction from the fused coefficients
when the two sensor data differ significantly.

(Li  et  al.  1995) 

4.2.4.1 Discrete Wavelet Transform (DWT)

The  Discrete  Wavelet  Transform  (DWT)  is  obtained  most  frequently  by  using  the  Mallat 
Algorithm.  In  image  processing,  the  Mallat  algorithm  constructs  a  scaling  function  and  three 
wavelet  functions.  The  scaling  function  produces  an  image  approximation  of  the  low  frequency 
information,  while  the  wavelet  functions  produce  high-low,  low-high,  and  high-high  images  that 
constitute  the  wavelet  coefficients  (Huihui  et  al.  2005). 

Using the Mallat algorithm, an input signal is both high-pass filtered and low-pass filtered and
then down-sampled by a factor of 2. The high-pass filter produces the detail coefficients while
the low-pass filter produces the approximation coefficients (see Figure 25). The approximation
coefficients are then low-pass and high-pass filtered and down-sampled to produce a new set of
approximation and detail coefficients. These steps continue until the terminal node is reached
(see Figure 26). Once the terminal node is reached, one method of fusing the images is to take the
average of the approximation coefficients at the highest transform scale and the largest absolute
value of the detail coefficients at each transform scale (Zheng et al. 2005). Because different
rules are applied to the low- and high-frequency portions of the signal, a better fused image is
thought to result (Jishuang & Chao, 2001). Once the coefficients are selected, the inverse of the
Mallat algorithm is performed to obtain the fused image.
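
The decomposition, the two-part fusion rule, and the inverse transform can be sketched on a 1-D signal. This is a simplified illustration rather than the procedure of any cited paper: it assumes an unnormalised Haar filter pair (pair average and pair half-difference), a signal length divisible by 2 at every level, and hypothetical function names.

```python
def analysis_step(signal):
    """One Mallat step with unnormalised Haar filters, then down-sampling by 2."""
    approx = [(signal[i] + signal[i + 1]) / 2.0 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / 2.0 for i in range(0, len(signal), 2)]
    return approx, detail


def synthesis_step(approx, detail):
    """Inverse of analysis_step: up-sample and recombine."""
    out = []
    for a, d in zip(approx, detail):
        out.extend([a + d, a - d])
    return out


def decompose(signal, levels):
    """Return (coarsest approximation, [finest detail, ..., coarsest detail])."""
    approx, details = [float(v) for v in signal], []
    for _ in range(levels):
        approx, detail = analysis_step(approx)
        details.append(detail)
    return approx, details


def reconstruct(approx, details):
    """Apply the inverse step from the coarsest scale back to the finest."""
    for detail in reversed(details):
        approx = synthesis_step(approx, detail)
    return approx


def fuse_dwt(signal_a, signal_b, levels=2):
    approx_a, details_a = decompose(signal_a, levels)
    approx_b, details_b = decompose(signal_b, levels)
    # Rule from the text: average the coarsest approximation coefficients ...
    approx = [(x + y) / 2.0 for x, y in zip(approx_a, approx_b)]
    # ... and keep the detail coefficient with the largest absolute value.
    details = [[x if abs(x) >= abs(y) else y for x, y in zip(da, db)]
               for da, db in zip(details_a, details_b)]
    return reconstruct(approx, details)
```

For example, `fuse_dwt([2, 0, 0, 0], [0, 0, 0, 2], levels=1)` keeps the edge detail from both inputs while averaging their low-frequency content.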


Figure 25: Schematic of Mallat Algorithm Decomposition Process

Figure 26: Tree Schematic of Obtaining Approximation and Detail Coefficients using the Mallat Algorithm


The  DWT  using  the  Mallat  algorithm  does  introduce  some  problems: 

• This transform is not shift-invariant, which can easily introduce artefacts in the fusion
process, such as ringing and aliasing. This is due to the down-sampling and up-sampling
by the factor of 2 in the decomposition and reconstruction. Up-sampling makes
the frequency-time space uncertain. The wavelet coefficients may change dramatically
for minor shifts of the input signal, and the energy distribution across the different
resolution levels may change dramatically as well, which may distort the result of
the reconstruction.

• Pixel by pixel analysis is not possible since data is reduced at each resolution; it is then
not possible to follow the evolution of a dominant feature through the different levels.

• The images are decomposed with sizes that are powers of two; because the resolution
is reduced by two at each level it is not possible to fuse images of arbitrary size.

(Huihui  et  al.  2005) 
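
The first problem in the list above can be seen with a very small example. The sketch below assumes an unnormalised Haar detail filter followed by down-sampling by 2 (an illustrative choice, with hypothetical function names) and compares the detail coefficients of a step edge with those of the same edge moved by one sample.

```python
def haar_detail(signal):
    """Detail coefficients of one decimated Haar step (unnormalised)."""
    return [(signal[i] - signal[i + 1]) / 2.0 for i in range(0, len(signal), 2)]


def energy(coefficients):
    """Sum of squared coefficients."""
    return sum(c * c for c in coefficients)


edge = [0, 0, 0, 0, 4, 4, 4, 4]
shifted = [0] + edge[:-1]          # same edge, moved right by one sample

print(haar_detail(edge))           # [0.0, 0.0, 0.0, 0.0]
print(haar_detail(shifted))        # [0.0, 0.0, -2.0, 0.0]
print(energy(haar_detail(edge)), energy(haar_detail(shifted)))  # 0.0 4.0
```

A one-sample shift moves the edge from a pair boundary into the middle of a pair, so the detail energy at this level jumps from zero to a non-zero value even though the scene content is the same.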

The main disadvantage of the DWT is its lack of shift-invariance. Despite these problems the
DWT still performs favourably when compared to other fusion methods. According to observers in
the Chen and Blum study (2005), the Shift-Invariant Discrete Wavelet Transform and the LAP
generally outperformed all other fusion methods, while the DWT and GRAD methods were the next
best. The quantitative measures of this study gave similar results: the DWT performed similarly
to the GRAD pyramid method but not as well as the LAP and the Shift-Invariant Discrete Wavelet
Transform (Chen & Blum, 2005). These results agree with those of Zheng et al. (2005), where the
DWT performed similarly to the GRAD method but not as well as the LAP when fusing night vision
imagery. Even though the DWT is more compact and contains non-redundant information, it still has
the major drawback of not being shift-invariant.

4.2.4.2 Shift-Invariant Discrete Wavelet Transform (SiDWT)

The Shift-Invariant Discrete Wavelet Transform (SiDWT) is an extension of the DWT that uses
different algorithms and approaches to improve the temporal stability and consistency of the
fused image. There are several ways to produce a shift-invariant version of the DWT. One simple
way is to eliminate the down-sampling step of the DWT. This is a very inefficient method of
creating a SiDWT, but it is simple and effective (Chari et al. 2005).

Another common method of producing a SiDWT is the à trous algorithm. The à trous algorithm is an
undecimated dyadic wavelet transform that is suitable for signal and image processing because it
is isotropic and does not introduce any artefacts (Wang, Ziou, Armenakis, Li, and Li, 2005).
Unlike the Mallat algorithm, the à trous algorithm uses a different formula and avoids the
down-sampling and up-sampling procedures when obtaining its coefficients. However, the
coefficients still contain magnitude information and the importance of the local features
(Huihui et al. 2005). The à trous algorithm has several advantages over the DWT:

•  Wavelet  and  approximation  planes  have  the  same  dimensions  as  the  original  image, 
which  avoids  the  introduction  of  artefacts  because  of  the  lack  of  up-sampling  and 
down-sampling; 

•  Unlike  the  DWT,  there  is  redundancy  of  information  at  each  scale,  allowing  the  better 
detection  of  a  dominant  feature;  and 

•  Can  be  applied  to  any  sized  images  for  fusion. 

(Huihui  et  al.  2005). 

The advantages of the à trous algorithm come at the cost of increased computational and time
demands (Huihui et al. 2005). Other approaches to producing a SiDWT use the Haar wavelet and the
Daubechies wavelet.

Obtaining  the  fused  image  by  the  inverse  SiDWT  is  similar  to  the  DWT  except  for  the  omission  of 
the  up-sampling  step  due  to  the  removal  of  the  down-sampling  step  in  the  decomposition  phase 
(Chari  et  al.  2005). 
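
An undecimated decomposition of this kind can be sketched in a few lines. The example below is an assumption-laden illustration, not the formulation of the cited papers: it works on a 1-D signal, uses a 3-tap smoothing kernel [1/4, 1/2, 1/4] whose taps are spread apart by 2^j samples at level j (the "holes"), clamps indices at the borders, and reconstructs by simple addition of the planes. Function names are hypothetical.

```python
def smooth(signal, step):
    """Low-pass filter with taps spaced `step` samples apart (borders clamped)."""
    n = len(signal)

    def at(i):
        return signal[min(max(i, 0), n - 1)]

    return [0.25 * at(i - step) + 0.5 * at(i) + 0.25 * at(i + step) for i in range(n)]


def a_trous(signal, levels):
    """Return (wavelet planes, residual); every plane has the input's length."""
    planes, current = [], [float(v) for v in signal]
    for j in range(levels):
        smoother = smooth(current, 2 ** j)
        # Wavelet plane = detail removed by this smoothing step.
        planes.append([a - b for a, b in zip(current, smoother)])
        current = smoother
    return planes, current


def reconstruct(planes, residual):
    """No up-sampling needed: add the planes back onto the residual."""
    out = list(residual)
    for plane in planes:
        out = [a + b for a, b in zip(out, plane)]
    return out


planes, residual = a_trous([3, 1, 4, 1, 5, 9, 2], levels=2)   # length is not a power of 2
print([len(p) for p in planes], len(residual))                 # [7, 7] 7
```

Because nothing is down-sampled, shifting the input by one sample simply shifts the interior coefficients by one sample, which is the property the decimated DWT lacks.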

For image fusion purposes, the SiDWT generally outperforms all other methods previously
discussed. In terms of subjective results the SiDWT and the LAP outperformed all other methods;
the SiDWT also placed in the top 3 in 4 of the 7 objective measures, which tied with the LAP for
the highest ranking (Chen & Blum, 2005). In terms of sharpness and entropy, the SiDWT performs
better than most of the pyramid-based schemes except for the LAP (Qiguang & Boashu, 2006). The
SiDWT outperformed all DWT-based methods when a ground truth image was obtained using a cut and
paste method (Hill et al. 2002). These results demonstrate the importance of shift-invariance in
wavelet transform fusion in terms of producing clear fused images free of any additional artefacts.

4.2.5  Feature  Level  Image  Fusion 

Feature level methods are the next stage of processing where image fusion may take place. Fusion
at the feature level requires extraction of objects (features) from the input images (Pohl & Van
Genderen 1998). These features are then combined with the similar features present in the other
input images through a pre-determined selection process to form the final fused image. Since one
of the essential goals of fusion is to preserve the image features, feature level methods have the
ability to yield subjectively better fused images than pixel-based techniques (Samadzadegan,
2004). A schematic of feature level fusion is shown in Figure 27, adapted from Samadzadegan (2004).

Common algorithms that fuse images at the feature level include edge detection methods and
artificial neural networks. For our purposes only the edge detection method will be discussed in
greater detail.

Figure 27: Schematic of Feature Level Fusion


4.2.5.1 Edge Detection Method

The goal of the edge detection method is to identify changes, based on contrast, in the input
images that are likely to identify important events or targets in the real world image. Since the
edges of objects present in the images are most likely to display these changes in image
intensity and are preserved in most cases, contours are often used to fuse images from different
sensors (Liu, Zhou, and Wang, 2006). Edge detection methods use specific filters to extract the
edge information of each band (Lanir, 2005). Depending on the filter used, specific features such
as lineament, edge, texture, and gray degree will be segmented from the input image (Rui & Ming,
2006). Edge detection methods can fuse images by first selecting an input image as the base image
and then overlaying the extracted features onto the base image (Lanir, 2005). Another method is to
calculate specific features from all input images and then use a pre-determined selection rule to
extract certain features from certain input images to obtain the final fused image. Since edge
detection methods do not work on a pixel by pixel basis like the previous algorithms, features are
calculated using windows of pixels which contain the pixel of interest and its neighbouring pixels
(Kwon, Der, and Nasrabadi, 2002). Specific features such as local-maximum gray level, local
contrast, local-average gradient strength, and local variation have been used in the past (Kwon
et al. 2002). This type of edge detection method is known as a search-based method, where edges
are detected based on search criteria that look for maxima and minima values (Siddique & Barner,
2002). Because a change in the intensity values of an object is likely to occur over a number of
pixels, edge detection algorithms usually take the 1st derivative of the input image in order to
find where the change in pixel intensity is the highest. Other methods, which are zero-crossing
based, take the 2nd derivative of the input image; the point at which it crosses zero indicates
where the rate of change in pixel intensity is the highest (Siddique & Barner, 2002).
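
The two classical approaches can be compared on a single intensity profile. The following is a minimal sketch under stated assumptions: a 1-D profile across a blurred edge, central finite differences for the derivatives, border samples left at zero, and hypothetical function names. It is not the detector used in any of the cited studies.

```python
def first_derivative(profile):
    """Central-difference 1st derivative; border samples are left at 0."""
    d = [0.0] * len(profile)
    for i in range(1, len(profile) - 1):
        d[i] = (profile[i + 1] - profile[i - 1]) / 2.0
    return d


def second_derivative(profile):
    """Finite-difference 2nd derivative; border samples are left at 0."""
    d = [0.0] * len(profile)
    for i in range(1, len(profile) - 1):
        d[i] = profile[i - 1] - 2.0 * profile[i] + profile[i + 1]
    return d


def search_based_edges(profile, threshold):
    """Indices where |1st derivative| is a local maximum at or above the threshold."""
    d = [abs(v) for v in first_derivative(profile)]
    return [i for i in range(1, len(d) - 1)
            if d[i] >= threshold and d[i] >= d[i - 1] and d[i] > d[i + 1]]


def zero_crossing_edges(profile):
    """Indices where the 2nd derivative changes sign."""
    d = second_derivative(profile)
    edges = []
    for i in range(1, len(d) - 1):
        if d[i] == 0.0 and d[i - 1] * d[i + 1] < 0:
            edges.append(i)          # exact zero between opposite signs
        elif d[i] * d[i + 1] < 0:
            edges.append(i)          # sign flips between i and i + 1
    return edges


profile = [10, 10, 10, 12, 20, 28, 30, 30, 30]   # a blurred step edge
print(search_based_edges(profile, threshold=4))   # [4]
print(zero_crossing_edges(profile))               # [4]
```

Both detectors place the edge at the same sample here; raising `threshold` above the peak gradient (8 for this profile) makes the search-based detector miss the edge entirely.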

An important factor in edge detection is applying the right threshold to the derivative function
to decide where an edge is present. Selecting a threshold that is too small will identify too
many edges, while a threshold that is too large may miss some important edges. Another critical
step in image fusion using the edge detection method is to select the appropriate size of window
from which the specific features are extracted. If the window is too small there will be a lot of
ambiguity and the target will not be properly extracted, and if the window is too large there
will not be enough overlap to enhance the identification of the target (Lanir, 2005).

The search-based and zero-crossing based edge detection methods are the more classical methods
and are single resolution. However, significant intensity changes can occur at different
resolution levels. Therefore, a single resolution is unlikely to be sufficient in many
applications (Siddique & Barner, 2002).

There are several multi-resolution (MR) edge detection methods that work very similarly to the
pyramid and wavelet methods described earlier. An MR edge detection method will generate a series
of progressively lower resolution images with fewer details (Siddique & Barner, 2002). All MR
edge detection methods can be classified as either linear or non-linear. Linear methods are likely
to blur important image features at each decomposition level while failing to remove small-scale
detail (Siddique & Barner, 2002). Non-linear methods have the ability to preserve large-scale
edges while completely removing structures smaller than a specified window size (Siddique &
Barner, 2002). A window size that is too small is susceptible to noise contamination, while a
window size that is large may be robust to noise contamination but may not detect finer details.

The Department of Electrical Engineering and Computer Science at Lehigh University describes
an MR method incorporating edge detection and wavelet coefficients. Firstly, an edge detection
algorithm is applied to the low-low bands at each wavelet level. The resulting edge images
provide information on the location and intensity of edges in the source images. Using the edge
information, the source images are segmented into regions, with each region assigned an activity
level based on the average of the high-frequency wavelet coefficients. The larger the activity
level of a region, the more informative the region is. Once the activity levels of the regions
are obtained, specified fusion rules are applied to obtain the final fused image. Examples of
the fusion rules are:

■  High  activity  regions  are  preferred  over  low  activity  level  regions; 

■  Edge  points  are  preferred  over  non-edge  points; 

■  Small  regions  preferred  over  large  regions; 

■  Avoid  isolated  points  in  decision  map. 

Based  on  the  pre-selected  rules  the  fused  wavelet  coefficient  image  is  obtained  and  by  the  inverse 
wavelet  transform  the  final  image  is  obtained. 
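
The first of the rules listed above (high activity regions preferred over low activity regions) can be sketched as follows. This is a hypothetical illustration only: it assumes the segmentation has already produced one shared label per coefficient, takes the activity level of a region to be the mean absolute high-frequency coefficient, and resolves ties in favour of the first source. The remaining rules are not implemented.

```python
def region_activity(detail, labels):
    """Mean absolute high-frequency coefficient for each region label."""
    sums, counts = {}, {}
    for value, label in zip(detail, labels):
        sums[label] = sums.get(label, 0.0) + abs(value)
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def fuse_by_region(detail_a, detail_b, labels):
    """Take every coefficient of a region from the source with the higher activity."""
    act_a = region_activity(detail_a, labels)
    act_b = region_activity(detail_b, labels)
    return [a if act_a[label] >= act_b[label] else b
            for a, b, label in zip(detail_a, detail_b, labels)]


labels = [0, 0, 1, 1]                  # two regions from the (assumed) segmentation
detail_a = [4, -4, 0, 0]               # source A is active in region 0
detail_b = [1, 1, 3, -3]               # source B is active in region 1
print(fuse_by_region(detail_a, detail_b, labels))   # [4, -4, 3, -3]
```

Selecting whole regions rather than individual coefficients is what distinguishes this feature level rule from the pixel level maximum-selection rules used by the pyramid and wavelet methods.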

Edge detection methods can be beneficial, but the threshold level and window size need to be
tailored to each individual application. This may not be suitable for all applications. The edge
detection method did not perform well compared to the PCA and the simple averaging technique in
target detection tasks. It had significantly lower detection rates and was rated poorly in the
subjective opinion of the observers (Lanir, 2005).

4.2.6  Decision  Level  Image  Fusion 

Decision Level methods are at the highest level of processing where image fusion can take place.
Fusion at the Decision Level takes Feature Level fusion one step further by declaring identities
for the objects recognized in the individual input images and then assigning a quality measure to
the extracted features (see Figure 28). The obtained information is then combined by applying
decision rules to reinforce common interpretation and resolve differences of the observed objects
(Pohl & Van Genderen 1998).

Figure 28: Schematic of Decision Level Fusion (adapted from Samadzadegan, 2004)

Because decision level fusion methods rely on object recognition by all sensors in order to
produce a valid representation of the input images, if an object is not recognized by all the
sensors (via the input images) then the output image will not utilize the full benefits of image
fusion (Gunatilaka & Baertlein, 2001). Decision level fusion also creates another source of
possible error when compared to the other fusion levels. If there is an error in the recognition
of objects from one of the sensors, this error will be transferred to the output fused image. Some
common algorithms used in decision level fusion include Dempster-Shafer Theory, Fuzzy Logic,
Rule-based Fusion, and Bayesian Networks. Because of the high computational demands, and the
drawback that every sensor needs to recognize the objects in an image to provide a valid
representation of the scene, no decision level fusion methods will be discussed in detail.


4.3  Evaluation  Metrics 

A  preliminary  search  was  conducted  to  identify  fusion  articles  which  contained  references  to  image 
fusion  performance,  image  fusion  evaluation  or  image  fusion  measurement.  This  initial  search 
identified  approximately  7,640,000  articles!  The  search  was  refined  to  identify  specific  articles 
that  only  included  one  modifier  (performance,  evaluation  or  measurement).  Image  fusion  and 
performance  resulted  in  1,560,000  hits;  image  fusion  and  evaluation  resulted  in  1,310,000  hits  and 
image  fusion  measurement  resulted  in  1,160,000  articles.  Using  the  term  “metrics”  and  “image 
fusion”  resulted  in  the  detection  of  974,000  articles. 

A refined literature search was then conducted to identify articles which contained the exact
key phrase terms. This search identified a total of 88 articles that included the exact term
“image fusion evaluation” and a total of 462 reports that included the exact term “image fusion
measurement”. In an effort to refine the search, the key phrases were modified with the terms
“objective” and “subjective”. Combining the term “objective” with “image fusion evaluation”
yielded only four results while “subjective image fusion evaluation” yielded seven results. Adding
the term subjective or objective to “image fusion measurement” yielded zero results. A refined
literature search was also conducted to identify articles which contained the exact key phrase
“image fusion analysis”. This search identified a total of 121 articles.

A review of the search results identified a number of medical references pertaining to the
analysis of medical images, e.g. “Image fusion analysis of [99m]Tc-HYNIC-octreotide scintigraphy
and CT/MRI in patients with thyroid-associated orbitopathy: the importance of the lacrimal gland”
(Kainz, Bale, Donnemiller, Gabriel, Kovacs, Decristoforo, and Moncayo, 2003). Refining the
search to eliminate specific medical applications only reduced the number of articles marginally.

Based  on  a  review  of  article  titles  and  abstracts  approximately  40  articles  were  selected  and 
obtained  for  detailed  examination.  A  review  of  these  articles  identified  additional  references  for 
review.  Approximately  70  articles  were  retrieved  and  reviewed  during  this  study. 

A  discussion  on  the  results  of  the  literature  review  on  image  fusion  metrics  is  detailed  below.  The 
image  fusion  metric  results  are  organized  into  the  following  sections:  introduction,  subjective 
evaluation  approaches,  objective  evaluation  approaches  and  suggestions  on  how  the  SIHS  TD 
should  investigate  fusion  performance. 

4.3.1  Evaluation  Criteria 

The  ultimate  aim  of  image  fusion  is  to  create  a  faithful  and  composite  image  that  retains  the 
important  information  from  the  source  images  while  minimizing  the  noise  caused  by  fusing  the 
images.  For  the  SIHS  application,  these  images  will  be  typically  viewed  and  interpreted 
(perceived) by an operator. A number of evaluation approaches and metrics have been proposed to
quantify and qualify image fusion performance. The literature sets out the following criteria:

• The fusion measure must be able to identify and localize visual information in the input
and fused images (Petrovic & Xydeas, 2005).

• The fusion process should preserve all relevant information of the input imagery in the
composite image (Petrovic & Xydeas, 2005). Conversely, the fusion metric must be able to
identify losses in relevant information.

• The fusion scheme should not introduce any artefacts or inconsistencies that would distract
the human observer or disrupt subsequent processing stages (Petrovic & Xydeas, 2005).
Conversely, the fusion metric must be able to identify artefacts or inconsistencies added to
the fusion image.

• The fusion measure must be able to evaluate perceptual importance (Petrovic & Xydeas,
2005).

• The fusion measure must be able to measure the accuracy with which input information is
represented in the fused image (Petrovic & Xydeas, 2005).

• The fusion measure must be able to distinguish between true scene information and
artefacts caused by the fusion process (Petrovic & Xydeas, 2005).

• For video sequences, the fusion measure must have temporal consistency, in that the gray
level changes in the input sequence must be present in the fused sequence without delay or
contrast change (Rockinger & Fechner, 1998). Temporal inconsistencies of fusion systems can
arise due to asynchronous operation of the sensors (i.e. cameras may not be capturing
images at the same rate or at consistent intervals).

• The fusion measure must have temporal stability, in that test and retest results must be
comparable (Rockinger & Fechner, 1998).

• The fusion process should be shift and rotation invariant: the fusion results should not
depend on the location or orientation of the object (Rockinger & Fechner, 1998).

4.3.2  Subjective  Evaluation  Approaches 

Fusion performance has been investigated using subjective and objective approaches. Since human
perception of the composite image is of paramount importance, a number of investigators have
used subjective or human-in-the-loop evaluation methods. The subjective approaches use well
established scientific methods adapted for video and still image quality assessment (Wang, Sheikh
and Bovik, 2003). Subjective evaluation approaches have been utilized in the assessment of
compression algorithms, noise reduction, image quality, etc. Image quality in itself does not reveal
how well human performance will be affected, e.g. a subject may perform best with a lower
resolution but clean image versus a higher resolution but noisy image (Wang et al., 2003).

Two basic subjective evaluation approaches were noted in the literature: active or task related
(quantitative) and descriptive (qualitative). Quantitative approaches were utilized by Toet and
Ijspeert (2001), where subjects assessed different fusion approaches on target detection and
recognition, as well as subject perception of situational awareness. Ryan and Tinkler (1995)
evaluated the potential advantage of fusion for helicopter pilots in real and simulated flight. Dixon,
Canga, Noyes, Troscianko, Nikolov and Bull (2006) completed a target detection task evaluating
different fusion algorithms and image compression methods.

In  addition  to  quantitative  subjective  tests,  a  large  number  of  qualitative  evaluations  have  been 
undertaken  to  rate  or  rank  the  quality  of  fusion  images.  Lanir  (2005)  evaluated  both  target 
detection  performance  and  fused  image  quality  generated  from  four  fusion  approaches.  Petrovic 
and Xydeas (2005) ranked subject preference for eight different fusion schemes. Subjective
evaluation approaches will be described in greater detail in the following sections.

While subjective evaluations are the most reliable, credible and direct method to evaluate fusion
performance, they are difficult to control, expensive and time consuming. Concerns raised in the
literature with subjective fusion evaluations are summarized below:

•  Results  are  task  (detection,  recognition,  identification,  and  situational  awareness)  and 
environment  dependent; 

•  Results  vary  according  to  target  characteristics; 

•  Results  are  complicated  by  observer  vision  ability  (acuity,  contrast  sensitivity,  colour 
deficiency,  etc.); 

•  The  size  and  composition  of  the  test  audience  (novice  versus  expert)  affects  the  results; 

•  Maturation  by  the  subjects  -  processes  within  the  participants  as  a  function  of  the  passage 
of  time  (not  specific  to  particular  events),  e.g.,  growing  older,  hungrier,  more  tired,  and  so 
on; 

• Inherent personality preferences may affect results, e.g. sensing people, in accordance with
the Myers-Briggs Type Indicator (MBTI), tend to focus on the present and on concrete
information gained from their senses, while intuitives tend to focus on the future, with a
view toward patterns and possibilities. Intuitives prefer to receive data from the
subconscious, or to see relationships via insights;

•  Reliability  within  subject  assessments  is  poor,  i.e.  there  is  a  need  to  realign  individual 
scales  if  multiple  sessions  are  required; 

• Repeatability in experiments is difficult due to variations in image sources:

o Ambiguous light levels;
o Signal intensities;
o Contrast;
o Noise;
o Intrinsic image characteristics (i.e. gray scale images vs. colour);

•  Differences  in  field  of  view; 

•  Differences  in  refresh  rates;  and 

•  Differences  in  display  performance. 

4.3.2.1 Quantitative Subjective Evaluation

Quantitative fusion assessment has focused on target detection, recognition and situational
awareness. Target detection and recognition have been assessed in naturalistic and in
laboratory settings. By their nature, real time assessments are difficult to duplicate; instead, most
fusion assessment experiments have focused on the capture of still images or live video of targets in
operational settings. The fusion community has captured and shared a number of multi-spectral
reference images for algorithm development and assessment. These images are then used in
psychophysical testing in a laboratory setting. Patches displaying just the target in question
(Essock, Sinai, McCarley, Krebs and DeFord, 1999) and full screen images (see Figure 30) have
been used in psychophysical tests. Short clips of video images have also been used in fusion
assessment. Video length reported in the literature varied between 100 frames (Rockinger &
Fechner, 1998) and complete missions (Ryan & Tinkler, 1995). In addition to the assessment of
fusion performance through live video, individual video frames or stills have been used to assess
objective fusion performance. The objective performance of the fusion system is determined by
averaging individual frame results across the entire clip.


Figure 29: Video sample - low light TV image (left); forward looking infrared image
(right) from Rockinger and Fechner (1998)

Figure 30: Image samples - smokescreen penetration and target pop-out is achieved
through the color fusion of visible CCD and FLIR imagery in this daytime scene.

Figure 30 shows imagery provided through the Canadian Defence Research Establishment,
Valcartier, Quebec, as part of a NATO study: (a) Intensified visible image; (b) Thermal IR (FLIR)
image; (c) Gray fused image; and (d) Color fused image (Waxman, Aguilar, Fay, Ireland,
Racamato, Ross, Carrick, Gove, Seibert, Savoye, Reich, Burke, McGonagle, and Craig, 1998).

Imagery used for target detection or scene detail assessment has included personnel, vehicles,
aircraft, buildings, and ground features (water, roads, trails, etc.). Unfortunately, except for a few
studies, environmental conditions, illumination, target detection distances, temperature and target
sizes or temperature differences (where appropriate) were not described in sufficient detail in the
literature reviewed.

4.3.2.1.1  Target  Detection,  Recognition  and  Identification  Assessment 

Investigators have performed experiments that assessed the perception of “details” or the
“detection” of object classes in an image. Toet and Franken (2003) assessed observers’
ability to detect the following classes of objects through a variety of fusion
approaches:

•  Building; 

•  Person; 

•  Road  or  path; 

• Fluid water (e.g. a ditch, a lake, a pond); or

• Vehicle (e.g. a truck, car or van).

Figure 31: Personnel and path target appearance in two colour fusion approaches
from Toet and Franken, 2003.

A  number  of  military  investigators  have  investigated  the  performance  of  fusion  systems  for 
classical  target  detection,  recognition  and  identification.  Given  the  sensitive  nature  of  the  results, 
these  reports  are  not  available  in  the  open  literature.  One  of  the  authors  (Angel  &  Massel,  2005) 
investigated  the  performance  of  two  optically  fused  Enhanced  Night  Vision  Goggles  (ENVGs)  in  a 
naturalistic  setting.  Target  detection,  recognition  and  identification  performance  was  evaluated 
using  friendly  and  enemy  targets.  Differences  in  fusion  performance  were  observed  between  the 
two  systems. 

A number of other investigators have evaluated simple target detection performance of fusion
systems in laboratory environments - see Essock, Sinai, McCarley, Krebs & DeFord (1999) and
Rockinger & Fechner (1998). Images were presented to observers and the time to detect and the
accuracy of detection were recorded. Images with and without targets were sequentially presented
to an observer using a computer display in a controlled environment. The number of images
presented to an observer varied according to the test, but sessions were typically restricted to 30
minutes due to fatigue concerns. Experiments investigating detection, recognition or detail
assessments typically began with subject training and a number of practice trials. During
familiarization training, subjects were first shown how targets actually appear in the different fusion
approaches - see Figure 32.

Figure 32: Sample image identifying detection targets from Lanir (2005)

The simple detection test protocol followed a typical signal detection procedure. The stimuli
(images) were presented on a computer screen in a dimly lit room. Random noise was first
presented to the subject for a brief period of time (Toet & Franken (2003) utilized 400 ms), followed
by the presentation of the image stimulus for another brief period of time (500 ms). The sizes of the
objects in the stimulus image were controlled. Subjects were required to indicate as rapidly as
possible the presence or absence of the different target classes. The brief target exposure and target
size served to prevent scanning eye movements and to force observers to make a decision based
solely on the stimulus presented. Toet and Franken (2003) state that scanning behaviour may differ
among image modalities and target types. An energy mask then followed, which helped to erase
any possible afterimages and equalized processing time between subjects. The noise image was
controlled for image size (same as the target) and colour. The accuracy results were evaluated
using the signal detection theory discriminability index, d'. The discriminability index uses hit and
false alarm rates and a Receiver Operating Characteristic (ROC) curve - see Figure 33.
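
As a minimal sketch of how the discriminability index can be computed from a detection experiment, the Python function below applies the standard signal detection definition d' = z(hit rate) - z(false alarm rate). The function name, the trial counts and the log-linear correction for extreme rates are illustrative assumptions; they are not taken from the studies reviewed.

```python
from statistics import NormalDist


def d_prime(hits, misses, false_alarms, correct_rejections):
    """Return the discriminability index d' = z(hit rate) - z(false alarm rate)."""
    # Log-linear correction (an assumption, one common convention): adding 0.5 to
    # each count keeps both rates strictly between 0 and 1, so z stays finite.
    hit_rate = (hits + 0.5) / (hits + misses + 1.0)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1.0)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(fa_rate)


# Hypothetical observer: 100 target-present and 100 target-absent trials.
print(round(d_prime(hits=90, misses=10, false_alarms=10, correct_rejections=90), 2))
```

A d' near zero indicates that the observer cannot separate target images from non-target images, while larger values indicate better detection for the fusion approach under test.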
4.3.2.1.2 Situational Awareness Assessment

Situation Awareness (SA), as defined by Endsley (1995, p. 36), “is the perception of the elements in
the environment within a volume of time and space, the comprehension of their meaning, and the
projection of their status”. Endsley (1995) describes SA as having three levels or phases:

•  Level  1  is  the  perception  of  elements  in  the  environment; 

•  Level  2  is  the  comprehension  of  the  current  situation;  and 

•  Level  3  is  the  projection  of  future  status. 

Level 2 SA not only includes the perception of elements found in Level 1 SA, but also the
understanding of the significance of the elements in light of the operator’s goal. According to
Endsley (1995, p. 37), in Level 2 SA the operator forms a holistic picture of the environment.

Toet and Franken (2003) assessed situational awareness by asking subjects to report the
location of personnel relative to a fixed landmark. Three sets of SA evaluation images were
developed, based upon operational scenarios developed for the Royal Netherlands Army. The
scenarios included the following:

•  Guarding  a  UN  camp; 

•  Guarding  a  temporary  base;  and 

•  Surveillance  of  a  large  area. 

For the guarding a UN camp scenario, Toet and Franken (2003) asked subjects to identify the position of the
human target relative to a fence line - see Figure 34. Targets were located on the left, center and
right of the fence apex. In the guarding a temporary base scenario, the test was to determine the location
of the person relative to a tree line, and finally the surveillance test included the determination of a
target location relative to a path. As with their detail detection exercise, images were gathered by Toet and
Franken (2003) for laboratory based psychophysical testing.


Figure  34:  Sample  human  target  from  UN  camp  SA  assessment  (Toet,  2002) 

Angel and Vilhena (2005) investigated the performance of two optically fused Enhanced Night
Vision Goggles (ENVGs) and dedicated thermal and image intensified camera systems for scout
and sentry performance in a naturalistic setting. Fused systems were reported to have performed
better than stand-alone LWIR or image intensified systems.

4.3.2.1.3  Spatial  Orientation 

The effects of colour fusion on global situational awareness have been investigated by Krebs and
Sinai  (2002)  and  Toet  and  Franken  (2003).  These  investigations  centered  on  the  evaluation  of 
scene  orientation,  i.e.  the  perception  of  the  global  scene  rather  than  local  demands.  The  ability  to 
accurately  perceive  horizons  and  water  features  is  difficult  with  monochrome  fusion  images  and 
individual  sensors.  Both  investigations  manipulated  the  presentation  of  scenes,  i.e.  either  a  scene 
was  upright  or  inverted. 

While Krebs and Sinai (2002) manipulated image orientation alone, Toet and Franken (2003) also
assessed the observer’s ability to perceive the horizon. Both investigations reported that Infra-Red
(IR) sensors alone performed the poorest at the perception of whether the image was upright. Toet
and Franken (2003) also reported that IR sensors performed the poorest at detecting the
location of the horizon - see columns one and two in Figure 35. Both investigations found that
image intensified sensors performed the best for spatial orientation.

While  fusion  did  not  improve  spatial  orientation  in  Toet  and  Franken  (2003),  fusion  did  improve 
the  detection  of  terrain  features  and  targets  in  the  global  scene. 


Figure  35:  Situational  awareness  results  from  Toet  and  Franken  (2003). 

(Note:  infrared  (IR),  single-band  or  gray  scale  (II)  and  double-band  or  colour  (DII)  intensified 
visual,  gray  scale  fused  (GF)  and  colour  fused  (CF1,  CF2)  imagery.) 

4.3.2.2 Qualitative Subjective Fusion Assessment

A variety of scales and methods have been used to evaluate the quality of fusion images. Typically a
subject is asked to rank or rate the quality of the image on a linear or ordinal scale. Three
approaches are discussed in the literature: simple ranking, Single Stimulus Continuous Quality
Evaluation (SSCQE) and Double Stimulus Continuous Quality Evaluation (DSCQE).

Simple ranking involves the assessment of the fusion images by a panel that individually ranks the
images into their perceived order of image quality. While Chen and Blum (2000) utilized a “small
evaluation group” to assess the fusion image quality, Petrovic (2007) used 100 subjects to evaluate
nine distinct fusion approaches.

In SSCQE, subjects are asked to assess the quality of a fusion image using a linear scale. Descriptive
anchors (bad, poor, fair, good and excellent) have been used on a 1-100 continuous scale. The
results of the image assessment are ranked to determine the relative quality of the images and thus
the fusion approach performance. Other investigations have simply used vision or operational
experts to rank the qualities of the fusion images.

SSCQE  requires  familiarization  training  for  subjects  to  develop  a  quality  frame  of  reference  for  the 
images  they  are  about  to  assess.  If  evaluations  require  multiple  sessions  then  scale  realignment 
techniques  are  required  (i.e.  an  image  that  rated  50  on  session  one  should  be  similar  in  value  on 
subsequent  days).  Sessions  typically  involve  the  sequential  presentation  of  images  to  a  reviewer 
who  scores  the  quality  on  a  scale. 
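
The single stimulus scoring described above can be sketched as follows: ratings on the 1-100 scale are averaged per fusion approach to give a Mean Opinion Score (MOS), and the approaches are then ranked. This is an illustrative sketch only; the function name, the approach labels and the ratings are hypothetical, and no scale realignment between sessions is attempted.

```python
from statistics import fmean


def mean_opinion_scores(ratings):
    """Average the 1-100 ratings per fusion approach and rank the approaches.

    ratings maps a fusion approach label to the list of scores it received.
    Returns (mos, ranking), where ranking lists the labels from best to worst.
    """
    mos = {approach: fmean(scores) for approach, scores in ratings.items()}
    ranking = sorted(mos, key=mos.get, reverse=True)
    return mos, ranking


# Hypothetical ratings from three subjects for three fusion approaches.
example = {
    "approach_A": [70, 80, 90],
    "approach_B": [40, 50, 60],
    "approach_C": [60, 70, 80],
}
mos, ranking = mean_opinion_scores(example)
print(mos, ranking)
```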


DSCQE  (or  paired  comparison)  involves  the  comparative  evaluation  of  pairs  of  fusion  images 
(same  image  but  different  fusion  approach)  -  see  Figure  36.  The  approach  utilizes  the  Law  of 
Comparative  Judgment,  where  the  percentage  of  the  time  one  fusion  approach  is  preferred  over 
another  is  used  as  an  index  of  the  relative  quality  of  the  fusion  approaches.  Pairwise  comparisons 
form the basis of Thurstone Scaling, and the approach has been found to generate reliable data
about  the  relative  subjective  quality  of  entities. 


Figure  36:  Paired  comparison  sample  from  Lanir  (2005) 

The Thurstone method requires respondents to compare two options at a time. They
are asked to choose only one selection (i.e., which fusion image is better?). Z-values are then
determined from a normal distribution using the "winning percentage" of these
comparisons. The results are then converted to a one-dimensional numbered scale using an
arbitrary reference point. The key feature in Thurstone scaling is that in addition to being able to
determine the ranking order of images, the distance and spread between respective options is
determined. While the paired comparison method is an excellent technique for evaluating
differences between fusion images, the number of comparisons can become excessive as more
fusion approaches (n) are utilized, since every image requires n(n-1)/2 pairings.
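
The paired comparison procedure can be sketched in Python as shown below. The sketch assumes the Case V simplification of Thurstone's Law of Comparative Judgment (the literature reviewed does not state which case was applied), assumes every pair was compared at least once, and clips extreme winning percentages so that the z-values stay finite. The function names and the win counts are hypothetical.

```python
from statistics import NormalDist


def pair_count(n):
    """Number of distinct pairings needed for n fusion approaches."""
    return n * (n - 1) // 2


def thurstone_case_v(wins, clip=0.01):
    """Scale values from a win matrix, where wins[i][j] counts i preferred over j.

    The lowest scoring approach is used as the arbitrary zero reference point.
    """
    n = len(wins)
    z = NormalDist().inv_cdf  # winning percentage -> z-value
    scores = []
    for i in range(n):
        total = 0.0
        for j in range(n):
            if i == j:
                continue  # an approach is never compared with itself (z = 0)
            p = wins[i][j] / (wins[i][j] + wins[j][i])
            p = min(max(p, clip), 1.0 - clip)  # avoid infinite z at 0 % or 100 %
            total += z(p)
        scores.append(total / n)
    lowest = min(scores)
    return [s - lowest for s in scores]


# Hypothetical results: 20 judgments per pair for three fusion approaches.
wins = [[0, 16, 19],
        [4, 0, 16],
        [1, 4, 0]]
print(pair_count(9), thurstone_case_v(wins))
```

Unlike a simple ranking, the resulting scale values indicate how far apart the approaches are perceived to be.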
In summary, ranking and Mean Opinion Score (MOS) techniques such as SSCQE and DSCQE
have been used by researchers to evaluate fusion image quality. These approaches allow subjective
evaluation of the performance of different fusion algorithms. Subjective evaluation approaches
have typically utilized still images in their examination of fusion performance; only two studies
reviewed utilized video.

4.3.3  Objective  Evaluation  Approaches 

Objective measures utilize input images and the fusion image to develop a numerical score of the
success of the fusion process (Petrovic, 2007). Unlike subjective assessments, which have
significant organizational and logistic requirements, objective measures can be computed
automatically. While a number of researchers have developed their own software to evaluate
fusion performance, a number of evaluation tools have been collated into software modules.

4.3.3.1 Objective Evaluation Software

MATIFUS is a downloadable Matlab toolbox for image fusion. It is a collection of functions that
supports image fusion operations and includes tools developed to evaluate objective fusion
performance. Currently the toolbox supports multiresolution decomposition techniques (Wavelet,
Steerable Pyramid, Quincunx Lifting Scheme, LAP and GRAD). The MATIFUS multiresolution
interface allows the manipulation of the fusion algorithm parameters such as filters, bands,
weighting coefficients, etc. - see Figure 37.


Figure  37:  Control  panel  of  MATIFUS 

Canga (2003) developed a Matlab fusion toolkit as part of his final year project at the University of
Bath. The toolkit includes a number of fusion techniques and evaluation metrics.

A third Matlab image fusion application is called the Image Fusion Toolbox by Metapix. The
toolbox includes the following fusion methods: linear superposition, PCA weighted superposition,
select minimum value, select maximum value, LAP, sd pyramid, ratio pyramid, GRAD, DWT
using DBSS (2,2) wavelets, SiDWT using Haar wavelet and MORPH. As with MATIFUS, the
functions are accessed through a graphical interface which allows image input, output and
parameter manipulation - see Figure 38.


Figure  38:  Control  panel  for  the  Image  Fusion  Toolbox 

A second image fusion toolbox was developed by the Canadian company Airborne Underwater
Geophysical Signals (AUG Signals) for use with Matlab and IDL. The toolbox contains traditional
as well as unique AUG Signals fusion approaches. AUG Signals Electro-optical Remote Sensing
Software is a complete software package incorporating a graphical user interface (see Figure 39),
fusion algorithms, and advanced image processing methods (including image registration,
detection, region classification, restoration, fusion, and hyperspectral data analysis).

Figure 39: Control panel for AUG Signals Image Fusion Toolbox

Another open source compendium of tools for image fusion is the Generalised Image Fusion
Toolkit (GIFT) from Mueller, Maeder, and O'Shea (2006). GIFT currently implements quadrature
mirror  filter  discrete  wavelet  transform  (QMF  DWT)  multi-scale  fusion  algorithms.  GIFT  is  built 
upon  the  Insight  Toolkit  (ITK)  which  is  an  open-source  software  system  able  to  perform  a  range  of 
registration  and  segmentation  algorithms  in  two-  or  three-dimensions.  Although  GIFT  currently 
only  implements  pixel-level  multi-scale  image  fusion,  efforts  are  underway  to  add  image  fusion 
metrics  into  the  program.  Possible  metrics  include  the  root  mean  square  error  (RMSE)  between  a 
“ground-truth”  image  and  the  fused  image,  the  RMSE  between  the  input  images  and  the  fused 
image,  image  entropy,  mutual  information,  spatial  frequency,  edge  strength  and  orientation,  and  the 
image  quality  index  (IQI). 
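
Three of the candidate metrics listed above can be sketched in a few lines of Python. The sketch is not GIFT code: it assumes images are equally sized lists of rows holding integer gray levels, uses the common textbook definitions of RMSE, Shannon entropy and spatial frequency, and all function names are hypothetical.

```python
import math
from collections import Counter


def rmse(reference, fused):
    """Root mean square error between a reference image and the fused image."""
    squared = [(r - f) ** 2
               for row_r, row_f in zip(reference, fused)
               for r, f in zip(row_r, row_f)]
    return math.sqrt(sum(squared) / len(squared))


def entropy(image):
    """Shannon entropy (bits per pixel) of the gray level histogram."""
    pixels = [p for row in image for p in row]
    total = len(pixels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(pixels).values())


def spatial_frequency(image):
    """Spatial frequency: root of the mean squared horizontal and vertical differences."""
    rows, cols = len(image), len(image[0])
    row_sq = sum((image[i][j] - image[i][j - 1]) ** 2
                 for i in range(rows) for j in range(1, cols))
    col_sq = sum((image[i][j] - image[i - 1][j]) ** 2
                 for i in range(1, rows) for j in range(cols))
    return math.sqrt((row_sq + col_sq) / (rows * cols))


# Hypothetical 2 x 2 test images: a checkerboard and a uniform gray patch.
checker = [[0, 255], [255, 0]]
flat = [[128, 128], [128, 128]]
print(rmse(checker, flat), entropy(checker), spatial_frequency(checker))
```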

4.3.3.2 Objective Evaluation Metrics

A number of fusion evaluation approaches are based upon objective metrics developed for simple
image quality assessments. Image quality metrics are used by manufacturers in the design and
development of scanners, printers, digital cameras and displays. A number of objective methods
have been developed to evaluate components of image quality, e.g. granularity and visually
weighted mean square error are used to predict stochastic noise (Farrell, 1999). Distortion metrics
have also been developed to predict visual performance with test targets and patterns. These
metrics typically require an ideal or “ground truth” image against which to compare manipulation
performance. To support this effort, the image engineering community has developed a large
database of “ground truth” test images to assess compression performance.

Objective metrics have also been developed to assess fusion performance. Unlike traditional image
quality metrics, which use a “ground truth” image, fusion metrics generally cannot rely on an ideal
fusion image because none is available. Adjusting fusion filter bands, decomposition levels,
weighting parameters, window sizes, etc. will affect fusion performance.

A large number of objective measures have been proposed to evaluate fusion performance; these
include Root Mean Square Error (RMSE), Image Quality (QW) and Fusion Quality Measure (Q), to
name a few. Xiaochun and Chin (2005) classified objective measures into four categories:

•  Methods  based  on  statistical  characteristics; 

•  Methods  based  on  definition; 

•  Methods  based  on  information  theory;  and 

•  Methods  based  on  important  features. 

Please  see  Chen  and  Blum  (2000);  Zheng  et  al.  (2004);  Xiaochun  and  Chen  (2005);  Wang,  Shen, 
Zhang  and  qui  Zhang  (2003);  Tian,  Chen  and  Zhang  (2004);  Tsagaris  and  Anastassopoulos  (2006); 
Zheng  et  al.  (2005);  Piella  (2004)  for  other  articles  detailing  objective  evaluation  metrics. 

Objective  evaluation  approaches  will  be  described  in  greater  detail  in  the  following  section. 

4.3.3.2.1  Objective  Evaluation  Using  Statistical  Characteristics 

A number of objective measures that utilize spectral information are available to assess fusion
performance. Some of these measures, such as Spatial Frequency (SF), Root Mean Square Error
(RMSE), Peak Signal to Noise Ratio (PSNR), Image Quality Index (IQI) and entropy, require the
use of a ground truth image to derive their measure. As stated earlier, a ground truth image is not
available for most fusion applications; the ideal fusion image is not known. The following is a
non-exhaustive list of statistical objective measures which use spectral information to evaluate
fusion performance.

Please note that only approaches which do not require the use of “ideal” images are included:

Standard Deviation: The gray values of the fusion image are compared to its average gray value
(which reflects the average intensity presented to vision). It is believed that higher standard
deviations correlate with better vision. Additionally, better vision correlates to an average gray
value of 128 (the mid-point of an 8-bit gray scale).
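
A minimal sketch of this measure is given below. It assumes an 8-bit gray scale image stored as a list of rows and uses the population standard deviation; the function name is hypothetical.

```python
from statistics import fmean, pstdev


def gray_level_statistics(image):
    """Return (average gray value, standard deviation) of a gray scale image."""
    pixels = [p for row in image for p in row]
    return fmean(pixels), pstdev(pixels)


# Hypothetical 2 x 2 fused image with maximum contrast.
mean_gray, std_gray = gray_level_statistics([[0, 255], [255, 0]])
print(mean_gray, std_gray)
```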

Contorted value of spectral image: This measure reflects the spectral distortion of the fusion image.
It is computed by determining the absolute difference between the fusion image and the original
images. Better fusion results in a lower contorted value.

Spectral  correlation:  This  measure  is  utilized  in  wavelet  decomposition  methods,  where 
correlations  in  the  horizontal,  vertical  and  diagonal  directions  are  determined  between  the  fusion 
image  and  the  source  images.  Better  fusion  is  believed  to  correspond  to  higher  spectral 
correlations. 
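
The two spectral measures above can be sketched as follows. The sketch assumes that the contorted value is the mean absolute difference over all pixels and that spectral correlation is a Pearson correlation coefficient computed between two equally sized arrays (for example, corresponding wavelet sub-bands of the fusion image and a source image). Both readings, and the function names, are assumptions rather than the definitions used in the papers reviewed.

```python
import math


def _flatten(image):
    return [p for row in image for p in row]


def contorted_value(fused, source):
    """Mean absolute gray level difference between fused and source image."""
    f, s = _flatten(fused), _flatten(source)
    return sum(abs(a - b) for a, b in zip(f, s)) / len(f)


def correlation(fused, source):
    """Pearson correlation between two equally sized images or sub-bands.

    Raises ZeroDivisionError if either input is completely uniform.
    """
    f, s = _flatten(fused), _flatten(source)
    mean_f, mean_s = sum(f) / len(f), sum(s) / len(s)
    cov = sum((a - mean_f) * (b - mean_s) for a, b in zip(f, s))
    var_f = sum((a - mean_f) ** 2 for a in f)
    var_s = sum((b - mean_s) ** 2 for b in s)
    return cov / math.sqrt(var_f * var_s)


# Hypothetical source image and a fused image that is uniformly 2 levels brighter.
source = [[10, 20], [30, 40]]
fused = [[12, 22], [32, 42]]
print(contorted_value(fused, source), correlation(fused, source))
```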

Standard  gray  scale  difference:  This  measure  corresponds  to  the  contrast  of  the  fusion  image.  The 
distribution  of  the  gray  values  across  the  image  is  compared  to  the  mean  gray  value.  The  measure 
predicts  the  enhancement  of  contrast. 

Fechner-Weber contrast measure: Fechner’s law states that sensation increases with the
logarithm of the stimulus. The human retina corrects all sensor values using a local comparison
with the mean response from the receptor neighbours (Rojas, 2007).

Target  Interference  Ratio  (TIR)/Target-Background  Interference  Ratio  (TBIR):  This  measure  is 
based  on  the  assumption  that  if  a  target  contrasts  highly  with  its  background,  it  will  be  easier  to 
find.  While  TIR  and  TBIR  indicate  the  separability  of  a  target  from  its  background,  the  TBIR 
favours  uniform  targets  against  uniform  backgrounds  while  the  TIR  does  not  (Peters  &  Strickland, 
1990). 

Fisher Distance: Discriminant function analysis is used to determine which variables discriminate
between two or more naturally occurring groups. The Fisher Linear Discriminant maps many
dimensions, i.e. source and fusion images, onto one. The resulting quadratic equation or Fisher
distance (Fisher’s vector) provides a linear estimate of differences. The optimum fusion process
should possess the highest Patrick-Fisher contrast distance.

Fractal dimensions: Fractal dimensions can describe the degree of abundance of texture
characteristics and the variety of pixel values.

Image Noise Index (INI): This index is used to give a clear picture of the improvement or
deterioration of the fused image. If INI is positive there is an improvement in the quality of the
fused picture. INI is related to the signal to noise ratio and utilizes the image entropy of the
original, fused and restored images.

Signal to Noise Ratio (SNR) Estimation (QS): This metric estimates the noise and blurring in images.

Mannos-Sakrison’s Filter Metric: This metric is used to compare the fusion image with the source
images in the frequency domain. The model is sensitive to middle range frequencies (Chari et al.
2005).

4.3.3.2.2  Objective  Evaluation  Based  on  Definition 

In addition to statistical classification methods, Xiaochun and Chen (2005) classify objective
metrics according to the geometric detail of the fusion image. Xiaochun and Chen identify three
definition parameters: average grads, spatial frequency and wavelet energies.

Average grads (average gradient): This approach measures how strongly gray values change between
neighbouring locations within an image. The measure is used to evaluate the image’s degree of
clarity. Increased sensitivity to small details is reflected in higher gradient scores. The gradient
reflects contrast differences, especially at edges where image gradients are strongest.

Spatial frequency (SF): Spatial frequency is used to measure the overall activity level of an image.

Wavelet Energies: This metric is based upon wavelet energy after image decomposition.
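
The first two definition parameters can be sketched as follows. The formulas are the forms commonly used in the fusion literature (row and column frequency from first differences for SF, and the mean root-mean-square of horizontal and vertical differences for the average gradient); they are assumptions here, since the report does not reproduce the equations of Xiaochun and Chen (2005). Function names are hypothetical.

```python
import math


def spatial_frequency(img):
    """SF = sqrt(RF^2 + CF^2), with RF and CF from first differences."""
    rows, cols = len(img), len(img[0])
    n = rows * cols
    # Row frequency: squared differences between horizontal neighbours.
    rf2 = sum((img[i][j] - img[i][j - 1]) ** 2
              for i in range(rows) for j in range(1, cols)) / n
    # Column frequency: squared differences between vertical neighbours.
    cf2 = sum((img[i][j] - img[i - 1][j]) ** 2
              for i in range(1, rows) for j in range(cols)) / n
    return math.sqrt(rf2 + cf2)


def average_gradient(img):
    """Mean of sqrt((dx^2 + dy^2) / 2) over all pixels with a right and lower neighbour."""
    rows, cols = len(img), len(img[0])
    total = 0.0
    for i in range(rows - 1):
        for j in range(cols - 1):
            dx = img[i][j + 1] - img[i][j]
            dy = img[i + 1][j] - img[i][j]
            total += math.sqrt((dx * dx + dy * dy) / 2)
    return total / ((rows - 1) * (cols - 1))


# Hypothetical 2 x 2 images: a flat patch and a small checkerboard.
flat = [[7, 7], [7, 7]]
checker = [[0, 10], [10, 0]]
print(spatial_frequency(flat), average_gradient(flat))        # 0.0 0.0
print(spatial_frequency(checker), average_gradient(checker))  # 10.0 10.0
```

Both values are zero for a featureless image and grow with the amount of fine detail, which is the behaviour the text attributes to these parameters.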

4.3.3.2.3  Objective  Evaluation  Based  on  Information  Theory 

Fusion performance can be assessed using information theory, in that fusion images should contain
more information than their source images. Information entropy can measure the extent of image
spectral information. Entropy is determined by evaluating the information content of an image.
Entropy is sensitive to noise and other unwanted fluctuations.
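
A minimal sketch of the entropy computation is given below, assuming the usual Shannon definition H = -Σ p(g) log2 p(g) over the gray-level histogram of the image. The function name is hypothetical.

```python
import math
from collections import Counter


def image_entropy(img):
    """Shannon entropy (bits per pixel) of the gray-level histogram."""
    pixels = [p for row in img for p in row]
    n = len(pixels)
    counts = Counter(pixels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


# Hypothetical 2 x 2 images.
print(image_entropy([[5, 5], [5, 5]]))        # one gray level: zero entropy
print(image_entropy([[0, 0], [255, 255]]))    # two equally likely levels: 1.0
print(image_entropy([[0, 85], [170, 255]]))   # four equally likely levels: 2.0
```

Because noise spreads the histogram over more gray levels, it raises this value without adding useful content, which illustrates the sensitivity noted above.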

Correlation  information  entropy:  This  is  a  constructed  parameter  which  assesses  information 
overlap  between  source  and  fused  images. 

Cross  Entropy:  Cross  entropy  denotes  the  correlation  extent  of  information  between  images.  The 
better  the  fusion  process  the  higher  the  correlation  and  cross  entropy. 

Union  Entropy:  Union  entropy  is  the  measurement  of  information  correlation  and  union 
information  amongst  multi-images. 

Image  Entropy:  This  represents  the  amount  of  information  that  is  transferred  from  the  source 
images  to  the  final  fusion  image.  Image  entropy  does  not  take  into  account  the  overlap  of 
information  from  the  source  images.  While  the  image  entropy  metric  makes  it  difficult  to  compare 
different  data  sets,  it  is  used  to  assess  fusion  approaches  which  operate  globally  (averaging  or 
PCA)  as  well  as  methods  which  focus  on  detailed  content  (DWT  and  morphological  fusion 
approaches). 

Image Fusion Performance Measure (IFPM): This measure utilizes the mutual information as well
as conditional information to evaluate the amount of information transferred from the source
images to the final fusion image. This measure utilizes common information only once in its
calculation.

Information  Based  Measure  (MI):  Mutual  information  represents  the  amount  of  information  that  is 
transferred  from  the  source  images  to  the  final  fusion  image. 
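
As a sketch under stated assumptions, mutual information between the fusion image and each source image can be estimated from joint gray-level histograms and summed over the sources. The summation over two sources is the form commonly used for this metric and is assumed here; the function names are hypothetical.

```python
import math
from collections import Counter


def mutual_information(img_x, img_y):
    """Mutual information (bits) between two equally sized images."""
    xs = [p for row in img_x for p in row]
    ys = [p for row in img_y for p in row]
    n = len(xs)
    joint = Counter(zip(xs, ys))   # joint gray-level histogram
    hist_x = Counter(xs)           # marginal histograms
    hist_y = Counter(ys)
    mi = 0.0
    for (x, y), c in joint.items():
        p_xy = c / n
        # p_xy / (p_x * p_y), written with raw counts
        mi += p_xy * math.log2(p_xy * n * n / (hist_x[x] * hist_y[y]))
    return mi


def fusion_mi(fused, source_a, source_b):
    """Sum of the mutual information between the fused image and each source."""
    return mutual_information(fused, source_a) + mutual_information(fused, source_b)


# Hypothetical 2 x 2 images.
a = [[0, 0], [1, 1]]
b = [[0, 1], [0, 1]]
print(mutual_information(a, a))  # identical images: equals the entropy, 1.0
print(mutual_information(a, b))  # statistically independent patterns: 0.0
```

Note that information shared by both sources is counted twice in the simple sum, which is the overlap issue the IFPM is described as addressing.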

4.3.3.2.4  Objective  Evaluation  Based  on  Important  Features

There  are  two  general  categories  of  important  features  found  in  fused  images,  edges  and  regions  of 
interest. 

Evaluation  method  based  on  protected  factor  of  edge  information 

Petrovic and Xydeas (2005) developed a metric (QE) which evaluates the amount of edge
information that is transferred from source images to the fused image. A Sobel detector is used to
detect edges and to determine their strengths and orientations. This metric evaluates the loss of
edge information between the fusion image and the source images. This measure is based upon the
theory that the human visual system resolves uncertainty by extracting information contained in
illumination variations or edges rather than actual signal values. The measure thus uses only edge
information and not regional information.
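
The edge parameters used by this kind of metric can be sketched as below. The code computes Sobel edge strength and orientation at one interior pixel and a simple relative-strength ratio between a source and a fused image. It is a simplified illustration only: the perceptual weighting and the orientation preservation term of the published metric are omitted, and all function names are hypothetical.

```python
import math

SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))


def sobel_at(img, i, j):
    """Return (strength, orientation in radians) at interior pixel (i, j)."""
    sx = sum(SOBEL_X[a][b] * img[i - 1 + a][j - 1 + b]
             for a in range(3) for b in range(3))
    sy = sum(SOBEL_Y[a][b] * img[i - 1 + a][j - 1 + b]
             for a in range(3) for b in range(3))
    return math.hypot(sx, sy), math.atan2(sy, sx)


def edge_strength_preserved(source, fused, i, j):
    """Ratio in [0, 1] of the weaker to the stronger edge strength at (i, j)."""
    g_src, _ = sobel_at(source, i, j)
    g_fus, _ = sobel_at(fused, i, j)
    if g_src == 0 and g_fus == 0:
        return 1.0  # no edge in either image: nothing was lost
    return min(g_src, g_fus) / max(g_src, g_fus)


# Hypothetical 3 x 3 patches: a vertical step edge, and the same edge at half contrast.
source = [[0, 0, 10], [0, 0, 10], [0, 0, 10]]
fused = [[0, 0, 5], [0, 0, 5], [0, 0, 5]]
print(sobel_at(source, 1, 1))                       # (40.0, 0.0)
print(edge_strength_preserved(source, fused, 1, 1)) # 0.5
```

A full implementation would evaluate every pixel, combine strength and orientation preservation, and weight each pixel by its edge strength before averaging.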

Edge  Dependent  Fusion  Quality  Index  (QE):  The  edge-dependent  fusion  quality  index  uses  both 
images  and  edges  to  determine  its  value.  In  this  measure  edges  get  higher  weight. 

Evaluation  method  based  on  image  quality 

Fusion Quality Measure/Index (Q): The fusion quality measure reflects how much salient
information contained in each of the input images has been transferred into the composite image.
In this measure all areas of the image are treated equally, but this is contrary to human vision where
some regions have higher importance, e.g. a tree-line bordering on an open field.

Weighted  Fusion  Quality  Index  (QW):  The  weighted  fusion  quality  index  reflects  how  much 
salient  information  contained  in  each  of  the  input  images  has  been  transferred  into  the  composite 
image.  In  this  measure  areas  which  are  perceptually  important  get  higher  weight. 

Universal Image Quality Index (UIQI): This quality metric is based upon the Wang and Bovik
(2002) universal image quality index, which is closely related to the Structural Similarity (SSIM)
measure. The UIQI gives an indication (Qb) of how much of the salient information contained in the
source images is transferred to the fusion image (Cvejic, Loza, Bull and Canagarajah, 2005).
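
For illustration, the Wang and Bovik index between two images x and y is Q = 4 σxy x̄ ȳ / ((σx² + σy²)(x̄² + ȳ²)). The sketch below evaluates it once over the whole image; applying it globally rather than over sliding windows, and the handling of a zero denominator, are simplifying assumptions. The function name `uiqi` is hypothetical.

```python
def uiqi(img_x, img_y):
    """Universal image quality index computed over the whole image."""
    xs = [float(p) for row in img_x for p in row]
    ys = [float(p) for row in img_y for p in row]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs) / (n - 1)
    var_y = sum((y - mean_y) ** 2 for y in ys) / (n - 1)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    denom = (var_x + var_y) * (mean_x ** 2 + mean_y ** 2)
    if denom == 0:
        return float("nan")  # undefined for constant or all-zero images
    return 4 * cov * mean_x * mean_y / denom


# Hypothetical 2 x 2 images: y is x with doubled brightness and contrast.
x = [[1, 2], [3, 4]]
y = [[2, 4], [6, 8]]
print(uiqi(x, x))  # identical images give the maximum value, 1.0 (up to rounding)
print(uiqi(x, y))  # luminance and contrast terms of 0.8 each give about 0.64
```

The index factors into correlation, luminance and contrast terms, so the second example loses quality only through the luminance and contrast mismatch.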

Visual  Difference  (VDA):  This  measure  evaluates  fusion  performance  as  the  total  area  affected  by 
visible  differences  in  the  fused  image  as  compared  to  source  images. 

5  Discussion


5.1  Hardware  -  Sensors 

The latest image sensor market is strongly driven by camera cell phone and digital still camera
applications and is moving toward larger numbers of pixels and smaller pixel sizes
(Mizobuci, Adachi, Yamashita, Okamura, Oshikubo, Akahane, and Sugawa, 2007). This drive has
resulted in significantly smaller image sensors. More specific to military applications, there has
been a recent push for NVG sensor improvements, driven by operational requirements from
military NVG users. The top desired enhancements for NVGs are listed below (Estrera
et al. 2003):

•  Multispectral  image  fusion; 

•  Lighter  weight; 

•  Smaller  size;

•  Reduced  power  consumption; 

•  Higher  resolution; 

•  Increased  range; 

•  Facilitate  individual  movement  techniques; 

•  Colour  image;  and 

•  Image  reliability. 

This  list  highlights  some  deficiencies  with  the  current  NVG  technologies.  In  terms  of  a  helmet 
based  sensor  fusion  system,  the  first  three  enhancements  need  to  be  immediately  addressed. 

5.1.1  Sensor  Criteria 

There  are  a  number  of  important  specifications  to  consider  in  selecting  sensors  for  image  fusion.  In 
this  particular  application  to  SIHS,  specifications  that  need  to  be  considered  with  the  ideal 
requirements  are: 


Table 6: SIHS Sensor Specifications

Specification                 Ideal
----------------------------  ---------------------------------------------------------
Size (length, width, height)  Small enough to mount on helmet, approx. 2 x 3 x 2 inches
Weight                        Lightweight, approx. 2 lbs or less
Resolution                    Average to good, at least 640 x 480 pixels
Real-time image capture       Frame rate at least 30 Hz
Sensitivity                   High sensitivity


To be considered real-time processing, the frame rate was set to be at least 30 frames per second
(fps or Hz). The frame rate is usually dependent on the resolution: the higher the resolution, the
lower the frame rate, and the higher the frame rate, the lower the resolution. It should be noted that
the “Maximum Frame Rate” column in Annex A reports the maximum frame rate, which does not
necessarily correspond to the listed resolution. This dependence was taken into account in the
analysis. To satisfy the specification, a sensor would have to have at least a 640 x 480 resolution
and at least 30 fps.
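
The dependence between resolution and frame rate can be made explicit by checking both thresholds for the same operating mode, as in the hedged sketch below. The operating modes shown are hypothetical examples, not values from Annex A, and the function name is an assumption.

```python
MIN_WIDTH, MIN_HEIGHT, MIN_FPS = 640, 480, 30


def meets_real_time_spec(modes):
    """True if at least one (width, height, fps) mode meets all three thresholds."""
    return any(width >= MIN_WIDTH and height >= MIN_HEIGHT and fps >= MIN_FPS
               for width, height, fps in modes)


# Hypothetical sensor A: high resolution OR high frame rate, never both together.
sensor_a = [(1280, 1024, 15), (320, 240, 120)]
# Hypothetical sensor B: exactly meets the specification in a single mode.
sensor_b = [(640, 480, 30)]

print(meets_real_time_spec(sensor_a))  # False
print(meets_real_time_spec(sensor_b))  # True
```

Sensor A would pass if its best resolution and its maximum frame rate were judged independently, which is the pitfall the paragraph above describes.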

These  characteristics  will  be  considered  in  the  evaluation  of  the  sensors  presented  below.  There  are 
several  other  parameters  to  consider,  that  are  not  as  important  in  this  evaluation.  They  are  listed 
below  for  reference  for  future  consideration: 

•  Performance  features  (e.g.  gain  control,  cooling,  high  speed,  anti-blooming);

•  Physical  features; 

•  Lens  mounting; 

•  Shutter  control; 

•  Operating  environment;  and 

•  Battery  life. 

Anti-blooming, or a “halo free” image, is an ideal feature to have. Blooming is a severe deficiency
of I2 sensors when they encounter bright light. It results in the loss of imaging information due to
I2 halo, especially at long distances (Estrera, Ostromek, Bacarella, Isvell, Iosue, Saldana, Beystrum,
2002). Figure 40 shows the difference between having the anti-blooming feature and not having it
in an intensified image of people around a vehicle with bright light sources.


Figure  40:  Comparison  of  anti-blooming  or  halo  free  feature  (Estrera  et  al,  2003) 

Image fusion of I2 and IR sensors is of great importance. Table 7, taken from Estrera et al. (2003),
shows the positive and negative aspects of each individual sensor and the fusion result. Image
fusion allows the positive aspects of each sensor to operate, while the negative aspects may not be
present.

Table 7: Comparison of Positive and Negative Aspects of I2 and IR Fused Sensors
(Estrera et al. 2003)

Aspect    I2/IR  Characteristic                                   Fusion Result
--------  -----  -----------------------------------------------  -----------------------------------
Negative  I2     No Laser Designator Detection                    Good Laser Designator Detection
Negative  I2     No Total Darkness                                Good Total Darkness Vision
Negative  I2     Poor Long Range Vehicle ID                       Good Long Range Vehicle ID
Negative  I2     No Thermal Contrast Detection                    Good Thermal Contrast Detection
Positive  I2     Excellent Low Thermal Contrast Target Detection  Good Low Thermal Contrast Detection
Positive  I2     Excellent Artificial Light Detection             Good Artificial Light Detection
Positive  I2     Excellent Shadow Detection                       Good Shadow Detection
Positive  I2     Excellent Silica Glass Penetration               Good Silica Glass Penetration

NOTE: I2 (Image Intensifier), IR (Infrared) defined as NIR, SWIR, MWIR, LWIR


5.1.2  Sensor  Evaluation 

A binary evaluation was done on each type of sensor to establish whether or not it met the
specification criteria. A summary of the evaluations is presented in the subsequent tables. If the
sensor met the specification, it is indicated by a “✓”, and if it did not, it is indicated by an “X”.
Items that have neither a “✓” nor an “X” indicate that information was missing for that specification.
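
The scoring scheme can be sketched as follows, with `True` for a met specification, `False` for an unmet one and `None` for missing information. The record shown is a hypothetical example and not a row from the tables below; the field and function names are assumptions.

```python
SPECS = ("size", "weight", "resolution", "real_time", "sensitivity")


def evaluate(sensor):
    """Return (number of specifications met, list of specifications with no data)."""
    total = sum(1 for spec in SPECS if sensor.get(spec) is True)
    missing = [spec for spec in SPECS if sensor.get(spec) is None]
    return total, missing


# Hypothetical sensor record.
example = {
    "size": True,
    "weight": True,
    "resolution": True,
    "real_time": False,
    "sensitivity": None,   # not stated on the data sheet
}
print(evaluate(example))  # (3, ['sensitivity'])
```

Keeping missing information separate from a failed specification matters when totals are compared, because a low total may reflect an incomplete data sheet rather than a poor sensor.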


Table  8:  LLLTV  Evaluation 


Manufacturer 

Sensor 

Size 

Weight 

Resolution 

Real 

Time 

Sensitivity 

TOTAL 

DVC 

DVC-1412  Series  - 
DVC-1412AM-MS 

✓ 

V 

V 

X 

✓ 

4 

DVC 

DVC-1412  Series  - 
DVC-1412AM-MT 

V 

V 

X 

4 

DVC 

DVC-1412  Series  - 
DVC-1412AC-00 

•/ 

V 

V 

X 

•/ 

4 

DVC 

DVC-1412  Series  - 
DVC-1412AC-TE 

•/ 

V 

V 

X 

•/ 

4 

Intevac 

E2006  Low  Light 

Level  Camera 

✓ 

✓ 

✓ 

✓ 

5 

NAC  Image 
Technology 

HSV  High-Speed 

Color  Video  System 
-  HSV-500C3 

V 

V 

X 

X 

3 

NAC  Image 
Technology 

Memrecam  fx  -RX5 
Micro  Camera  Head 

V 

S 

V 

V 

X 

4 

PCO  Imaging 

PCO  Series  -  Model
PCO.1600

X 

X 

V 

V 

2 

PCO  Imaging 

PCO  Series  -  Model
PCO.2000

X 

X 

✓ 

X 

2 

PCO  Imaging 

PCO  Series  -  Model
PCO.4000

X 

X 

✓ 

X 

2 

PCO  Imaging 

Pixelfly  Series  - 
Model  Pixelfly 

✓ 

✓ 

✓ 

/ 

4 

PCO  Imaging 

Pixelfly  Series  - 
Model  Pixelfly  QE 

V 

V 

✓ 

X 

3 

PCO  Imaging 

Sensicam  Series  - 
Model  Sensicam  EM 

V 

X 

✓ 

X 

2 

PCO  Imaging 

Sensicam  Series  - 
Model  Sensicam  QE 

•/ 

X 

✓ 

X 

2 

PCO  Imaging 

Sensicam  Series  - 
Model  Sensicam  QE 
Double  Shutter 

X 

✓ 

X 

2 

Intevac’s E2005 LLL camera met the specifications for the SIHS application. PCO Imaging’s
Pixelfly met all the other specifications; however, information was missing regarding the sensitivity.


Table  9:  CMOS  Evaluation 


Manufacturer 

Sensor 

Size 

Weight 

Resolution 

Real 

Time 

Sensitivity 

TOTAL 

Intevac 

NightVista  E2010 

■/ 

✓ 

✓ 

4 

Intevac 

NightVista  E3010 

✓ 

✓ 

3 

Irvine  Sensors 

Corp. 

MVC-FF0229 

✓ 

X 

2 

PCO  Imaging 

PCO  Series  -  Model
PCO.1200  HS

X 

✓ 

2 

PCO  Imaging 

PCO  Series  -  Model
PCO.1200  S

X 

V 

V 

2 

Prosilica  Inc. 

GE640C  -  02- 
2001 A 

✓ 

X 

V 

2 

Vision  Research 

Inc 

High  Speed  Camera 
--  Phantom®  v6.2 

X 

V 

X 

2 

Unfortunately,  most  of  the  CMOS  sensors  were  missing  information  regarding  weight  and  light 
sensitivity.  From  the  above  analysis,  Intevac’s  E2010  is  the  most  suitable  camera  for  the  SIHS 
application  because  it  met  all  of  the  specifications. 

Table  10:  SWIR  Evaluation


Manufacturer 

Sensor 

Size 

Weight 

Resolution 

Real 

Time 

Sensitivity 

TOTAL 

FLIR 

SC4000  HS-NIR / 

SC6000  HS-NIR 

V 

S 

2 

Intevac 

LIVAR  400  Short  Wave 
Infrared  Camera 

V 

V 

V 

3 

Sensors 

Unlimited  Inc 
(Goodrich) 

SU640SDV-1.7  RT 
SU640SDV  Vis-1.7  RT 
High  Resolution 

InGaAs  and  Vis- 
InGaAs  SWIR  Area 
Cameras 

X 

X 

V 

V 

2 

Sensors 

Unlimited  Inc 
(Goodrich) 

SU640SDWH-1.7  RT 
SU640SDWHVIS-1.7 

RT  High  Resolution 
Windowing  InGaAs  and 
Vis-InGaAs  SWIR  Area 
Cameras 

X 

X 

S 

2 

Sensors 

Unlimited  Inc 
(Goodrich) 

SU320KTX-1.7RT  High 
Sensitivity  InGaAs 

SWIR  Camera 

X 

X 

✓ 

2 

Sensors 

Unlimited  Inc 
(Goodrich) 

SU320M-1.7RT 

InGaAs  SWIR  Camera 

✓ 

✓ 

X 

V 

3 

Sensors 

Unlimited  Inc 
(Goodrich) 

SU320MX-1.7RT  High 
Sensitivity  InGaAs  NIR 
MiniCamera 

V 

s 

X 

S 

3 

Sensors 

Unlimited  Inc 
(Goodrich) 

SU320MVis-1.7RT 

Visible  and  SWIR 
Response  InGaAs  NIR 
MiniCamera 

s 

s 

X 

s 

3 

Sensors 

Unlimited  Inc 
(Goodrich) 

SU320MSVis-1.7RT 
Visible  and  SWIR 
Response  InGaAs 
MiniCamera 

V 

V 

X 

V 

3 

Lumitron 

NIR320  OEM  InGaAs 
Near-Infrared  Camera 

X 

X 

X 

V 

1 

There was no information provided for the thermal sensitivity of the SWIR sensors. The
evaluation above shows that four Sensors Unlimited cameras (the SU320M-1.7RT InGaAs SWIR
Camera, the SU320MX-1.7RT NIR MiniCamera, and the SU320MVis-1.7RT and SU320MSVis-1.7RT
Visible and SWIR models) met most (3 out of 5) specifications for the SIHS application. However,
these sensors did not meet the resolution specification, as they only had a 320 x 240 resolution
where the minimum desired is 640 x 480.

The other Sensors Unlimited cameras met the resolution and real time requirements; however, they
were slightly too large (6” x 3” x 3”). Intevac’s LIVAR 400 Short Wave Infrared Camera would be
the most suitable for SIHS (for resolution and real time), as it also met three specifications but was
missing information for the weight.

Table  11:  MWIR  and  LWIR  Evaluation


Manufacturer 

Sensor 

Size 

Weight 

Resolution 

Real 

Time 

Sensitivity

TOTAL 

UTWS  (Urban  Thermal  Weapon 

X 

S 

X 

✓ 

✓ 

3 

DRS  NVEC 

Sight  II) 

CSTWS  (Crew  Served  Thermal 

X 

X 

✓ 

✓ 

3 

DRS  NVEC 

Weapon  Sight  II) 

MX-2  (Rugged  miniature  thermal 

X 

X 

X 

•/ 

X 

1 

DRS  NVEC 

imager) 

COBRA-IR  (Covert  Over  Barrier 

X 

V 

X 

•/ 

X 

2 

DRS  NVEC 

Recon  Assistant  -  Infrared) 

DRS  NVEC 

HelmetIR 

✓ 

v 

X 

X 

X 

2 

Mini-IR  Plus  (hand-held  thermal 

X 

X 

/ 

X 

1 

DRS  NVEC 

imager) 

DRS  NVEC 

PVS-7  Style  (Single  tube  NVG) 

X 

0 

MANTIS  (Multi-Adaptable  Night 

1 

DRS  NVEC 

Tactical  Imaging  System) 

4x  Raptor  (4-power  night  vision 

X 

s 

1 

DRS  NVEC 

weapon  sight) 

6x  Raptor  (6-power  night  vision 

X 

X 

0 

DRS  NVEC 

weapon  sight) 

ELCAN 

SpecterIR+

X 

0 

(Raytheon) 

FLIR 

ThermoVision  Photon 

✓ 

V 

X 

2 

FLIR

ThermoVision  ThermoSight 

1 

FLIR

ThermoVision  A10 

✓ 

X 

1 

SC4000  HS  MWIR/  SC6000  HS- 

✓ 

V 

V 

3 

FLIR

MWIR 

SC4000  HS  LWIR /  SC6000  HS- 

✓ 

V 

V 

3 

FLIR

LWIR 

Personal  Miniature  Thermal 

✓ 

✓ 

X 

s 

v 

4 

Irvine  Sensors 

Viewer 

Irvine  Sensors 

CAM-NOIR  Thermal  Camera 

X 

0 

Irvine  Sensors 

Miniature  Camera 

V 

1 

L3 

X 

V 

X 

■/ 

•/ 

3 

Communications 

Thermal-Eye 

X200xp 

L3 

✓ 

X 

X 

3 

Communications 

Thermal-Eye 

3600AS 

L3 

X 

X 

✓ 

✓ 

3 

Communications 

Thermal-Eye 

3620AS

L3 

✓ 

X 

X 

✓ 

3 

Communications 

Thermal-Eye 

3640AS 

UC320U  OEM  Microbolometer 

✓ 

•/ 

X 

s 

X 

3 

Lumitron 

Infrared  Camera 

UC320D  OEM  Microbolometer 

X 

✓ 

X 

X 

2 

Lumitron 

Infrared  Camera 

Many of the sensors were missing information regarding thermal sensitivity. From the above
analysis, Irvine’s Personal Miniature Thermal Viewer met 4 out of 5 of the desired specifications.
However, it did not meet the resolution specification as it is only 320 x 240. Other suitable
candidates, considering the 30 fps and 640 x 480 resolution requirements, were the FLIR cameras
SC4000 HS MWIR/SC6000 HS-MWIR and SC4000 HS LWIR/SC6000 HS-LWIR. The FLIR data sheets did
not provide any information on size or weight; otherwise these cameras satisfied all other
specifications.


Table  12:  EBAPS  Evaluation 


Manufacturer 

Sensor 

Size 

Weight 

Resolution 

Real 

Time 

Sensitivity 

TOTAL 

Intevac 

NightVista 

✓ 

2 

Intevac 

ISIE6 

✓ 

2 

Intevac 

ISIE10 

V 

2 

There  was  a  great  deal  of  missing  information  on  the  data  sheets  provided  by  Intevac  for  the 
EBAPS.  Model  ISIE10  is  still  in  its  development  stages.  As  size,  weight,  and  sensitivity  are 
unknown,  it  cannot  be  determined  whether  any  of  these  sensors  is  suitable  for  SIHS.  However,  the 
frame  rates  did  not  meet  the  desired  specification  for  NightVista,  ISIE6,  and  ISIE10,  as  they  were 
only  30,  27.5,  and  37  respectively. 

All  sensor  technical  datasheets  recommended  for  SIHS  are  in  Annex  C,  except  for  EBAPS  as  there 
were  none  available. 

5.2  Hardware  -  Fusion  Boards 

An important capability for fielded image fusion systems is computational co-registration.

Because dynamic scenes typically have foreground/background objects in relative motion, there is
no single computational mapping for visible/thermal infrared cameras with any degree of parallax
that will bring both image inputs into exact alignment at all times (Wolff et al. 2006). The new
Equinox DVP-4000 allows for co-registration, which previous Equinox models did not, and requires
less power than those models. The Imagize FP-3500 uses a closed source algorithm approach while
the Equinox DVP-4000 uses an open source approach with algorithms developed by Waterfall
Solutions (Surrey, England). However, for an open source desktop application for image fusion,
the Octec ADEPT60 is widely used and features multiple algorithm capability with multiple analog
and digital inputs. Due to the lack of literature and specifications available on the image fusion
processors, a full criterion based evaluation cannot be performed on the available processors.

[Figure 41 panels: intensified VISNIR channel; thermal infrared channel; with computational
co-registration]

Figure  41:  Example  of  co-registration  (Wolff  et  al,  2006)


5.3  Fusion  Algorithms 

All the previously discussed algorithms have advantages and disadvantages. A description of the
specific advantages and disadvantages of each algorithm is available in Table 13. According to
several studies, the SiDWT generally outperforms all other algorithms in both subjective and
objective measures. The simple averaging technique, albeit easy to use, suppresses all large contrast
values in the resulting fused image. With PCA, the fused image tends to be of lesser quality than
the input images because commonly only the first eigenvalue is selected for the fused image. The
GRAD and RoLP have decreased image sharpness, and the RoLP also creates spots on the fused
image. A simple DWT algorithm produces favourable subjective and objective results but is not
shift-invariant. Even though it may have increased computational and time demands, the SiDWT,
or variations of it, is the algorithm that may produce the best fusion results for night time imagery.
The LAP also produces very good results for image fusion using night time imagery, but it has the
potential of changing the exact location of objects in the fused image when compared to the input
images. Nonetheless, the LAP is the highest performing pyramid based fusion algorithm. The most
desirable algorithm may change depending on the type of objective tests and the type of input
imagery, but the common conclusion from the literature is that the SiDWT and LAP tend to
outperform the other methods.
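
The two pixel-level methods mentioned above can be illustrated with a short sketch. It shows simple averaging halving the contrast of a feature that appears in only one input, and PCA weights taken from the principal eigenvector of the 2 x 2 covariance matrix of the two inputs. The closed-form eigenvector, the normalisation of the weights to sum to one, and the restriction to non-negatively correlated inputs are assumptions of this sketch; all function names are hypothetical.

```python
import math


def average_fusion(img_a, img_b):
    """Pixel-wise mean of two equally sized images."""
    return [[(a + b) / 2 for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(img_a, img_b)]


def contrast(img):
    """Difference between the brightest and darkest pixel."""
    pixels = [p for row in img for p in row]
    return max(pixels) - min(pixels)


def pca_weights(img_a, img_b):
    """Fusion weights from the principal eigenvector of the 2 x 2 covariance matrix."""
    xs = [p for row in img_a for p in row]
    ys = [p for row in img_b for p in row]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    caa = sum((x - mx) ** 2 for x in xs) / n
    cbb = sum((y - my) ** 2 for y in ys) / n
    cab = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    if cab < 0:
        raise ValueError("sketch assumes non-negatively correlated inputs")
    # Largest eigenvalue of [[caa, cab], [cab, cbb]].
    lam = (caa + cbb) / 2 + math.sqrt(((caa - cbb) / 2) ** 2 + cab ** 2)
    if cab > 0:
        v = (cab, lam - caa)
    else:
        v = (1.0, 0.0) if caa >= cbb else (0.0, 1.0)
    total = v[0] + v[1]
    return v[0] / total, v[1] / total


# Hypothetical inputs: a bright stripe in A, a featureless B.
img_a = [[0, 200], [0, 200]]
img_b = [[100, 100], [100, 100]]
fused = average_fusion(img_a, img_b)
print(contrast(img_a), contrast(fused))  # 200 100.0

# Hypothetical correlated inputs: B is A at half amplitude.
print(pca_weights([[0, 2], [4, 6]], [[0, 1], [2, 3]]))  # approximately (0.667, 0.333)
```

In the second example the higher-variance input receives the larger weight, which is how PCA fusion favours the input with more information content.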

Table 13: Advantages and Disadvantages of Specific Algorithms

Pixel-Level

Simple Averaging
  Advantages:
  -  low computational demand
  -  decreased time
  -  simple
  -  used as benchmark for comparison of other fusion methods
  Disadvantages:
  -  high contrast pixel values in input image are depressed in value in the fused image

PCA
  Advantages:
  -  selects optimal weighting coefficients based on information content
  -  removes redundancy present in input image
  -  compresses large amounts of inputs without much loss of information
  Disadvantages:
  -  usually selects 1st eigenvalue, which does not contain all of the patterns between inputs
  -  fused image will be of lesser quality than any of the input images
  -  strong correlation between the input images and fused image is needed

Pyramid Based

LAP
  Advantages:
  -  most frequently studied pyramid transform
  -  produces favourable results: both subjective and objective
  -  able to pre-determine which pixels are used in the fused image
  Disadvantages:
  -  decomposes images by a factor of 2, which restricts the composition of the fused image
  -  does not distinguish between material edges and temperature edges, which may create an
     abundance of information and clutter the scene in the fused image
  -  may alter exact location of objects in the fused image

MORPH
  Advantages:
  -  removes image details without adding any gray scale bias or altering location
  -  can extract objects of a certain size from an image
  Disadvantages:
  -  decomposes images by a factor of 2, which restricts the composition of the fused image
  -  performs worse than the LAP in subjective tests

GRAD
  Advantages:
  -  produces horizontal, vertical, and diagonal pyramid sets compared to just horizontal and
     vertical
  -  improved temporal stability over the LAP
  -  transfers a greater amount of salient information when compared to the MORPH
  Disadvantages:
  -  decrease in visual clarity when identifying targets
  -  decrease in sharpness when compared to LAP
  -  does not transfer as much salient information as the LAP

RoLP
  Advantages:
  -  encodes absolute luminance contrasts compared to absolute luminance differences in the LAP
  Disadvantages:
  -  performs inferior to LAP, GRAD, MORPH in objective performance measures
  -  decreased image sharpness when compared to LAP and GRAD
  -  produces algorithm-created spots in fused image

Table 14: Advantages and Disadvantages of Specific Algorithms (continued)

Wavelet Transforms
  Advantages:
  -  refer to Section 4.2.4 for a list of the advantages of wavelet based methods over pyramid
     based methods

DWT
  Advantages:
  -  different rules are applied to the low and high frequency portions of the signal
  -  performs favourably when compared to other fusion methods
  Disadvantages:
  -  is not shift-invariant
  -  pixel by pixel analysis is not possible
  -  not possible to fuse images of different sizes

SiDWT
  Advantages:
  -  is shift invariant
  -  improved temporal stability over DWT
  -  redundancy of information allowing better detection of dominant feature
  -  can be applied to any sized images for fusion
  -  generally outperforms all other methods in subjective and objective tests
  -  improved sharpness over most pyramid based methods
  Disadvantages:
  -  requires increased computational and time demands

Feature-Level

Edge Detection
  Advantages:
  -  extracts features dependent on changes occurring over a number of pixels rather than a single
     pixel
  -  non-linear methods are able to preserve large-scale edges while removing structures smaller
     than a specified window
  Disadvantages:
  -  difficult to select appropriate size threshold value for all applications
  -  difficult to select appropriate window size for all applications
  -  classical methods detect edges at only a single resolution
  -  linear models are likely to blur important image features at each decomposition level

5.4  Metric  Discussion 

A  large  number  of  objective  metrics  and  a  smaller  number  of  subjective  metrics  have  been 
developed  and  utilized  by  researchers  to  assess  fusion  performance.  The  ability  of  the  objective 
measures  to  predict  human  performance  is  a  known  concern  of  these  researchers.  Xiaochun  and 
Chen  (2005)  noted  that  the  use  of  objective  measures  is  practical  and  effective  only  if  the  results 
are  in  accordance  with  subjective  evaluation  results.  Farrell  (1999,  pg  286)  stated  that  “there  is  no 
single  image  quality  metric  can  predict  our  subjective  judgements  of  image  quality  because  image 
quality  judgements  are  influenced  by  a  multitude  of  different  types  of  visible  signals,  each 
weighted  differently  depending  on  the  context  under  which  a  judgement  is  made.” 

A review of the literature indicates that many of the earlier “statistical characteristic” objective
measures lacked the rigor demanded in many scientific circles. Concerns with early objective
measures include the following:

•  Objective  measures  were  based  on  static  assessment  of  fusion  and  source  images.  Real 
time  objective  measures  were  not  identified. 

•  The  objective  measures  cannot  be  applied  across  all  fusion  approaches. 

•  The  objective  measures  have  not  been  adequately  validated  with  human  performance. 

•  If  tested,  the  objective  measures  do  not  correlate  well  with  human  performance. 

•  The  objective  measures  are  task  and  condition  dependent. 

•  Objective  measures  were  developed  with  static  images  and  are  not  currently  designed  for
real-time  application.

Measurement is the process of observing and recording the observations that are collected as part of a
research  effort.  For  image  fusion,  researchers  have  suggested  a  variety  of  objective  measures  to 
assess  the  success  of  the  fusion.  Ideally  the  researcher  has  developed  a  theory  upon  which  to  base 
the  validity  of  their  measure  (theoretical  constructs).  Construct  validity  is  the  assessment  of  how 
well  the  researcher  translated  their  theories  into  actual  measures.  The  limited  review  of  the 
literature  did  not  identify  theoretical  constructs  for  many  of  the  older  statistical  objective  measures. 
Given  the  limitations  of  simple  metrics,  researchers  have  focussed  on  developing  metrics  based  on 
information  theory  and  human  perception  (important  features).  Petrovic  and  Xydeas  (2005) 
developed a metric based on the theory that “the human visual system (HVS) resolves uncertainty
of visual stimuli by extracting information contained in illumination variations, that is, in changes
(edges) rather than in actual signals” (Petrovic & Xydeas, 2005 pg 2). The authors reported a high
correlation  between  subjective  ratings  and  objective  metrics  which  consider  the  preservation  of 
input  information  in  the  form  of  edge  parameters  (strength  and  orientation)  and  the  evaluation  of 
edge’s  perceptual  importance.  Convergent  support  for  the  validity  of  edge  dependent  measures 
was  reported  by  Chen  and  Blum  (2005).  In  their  study  Chen  and  Blum  evaluated  28  night  vision 
images  using  13  different  fusion  approaches.  Fusion  performance  was  assessed  using  a  limited 
expert  subjective  panel  and  seven  objective  measures.  Chen  and  Blum  reported  that  the  Objective 
Edge  Based  Measure  (QE)  provides  the  best  correlation  between  subjective  and  objective  results. 

The scientific method requires that metrics have both internal and external validity. Internal
validity concerns whether a fusion approach did in fact make a difference in operator performance.
External validity relates to generalizability: “to what populations, settings, and treatment variables
can this effect be generalized?” (Campbell and Stanley, 1963, p. 5). Valid fusion performance
measures should have face validity, predictive validity, and construct validity. The SIHS-TD
fusion assessment program should focus on measures which are valid and meaningful.

Concerns  with  real  time  objective  performance  assessment  could  be  overcome  by  a  variety  of 
means.  Videos  could  be  sampled  at  a  known  rate  and  each  sample  set  of  fusion  and  source  images 
could  be  evaluated  after  data  capture.  The  objective  performance  would  then  be  determined  by  an 
average  rating  over  the  data  set.  If  the  fusion  test  bed  is  fast  enough  then  it  may  be  possible  to 
conduct  objective  evaluations  in  real  time  or  near  real  time  with  a  minimum  of  lag. 
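
The sampling approach described above can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the function names, the 5 Hz sampling rate and the placeholder metric are hypothetical and are not drawn from the literature reviewed. Any of the objective metrics discussed in this report could be substituted for the placeholder.

```python
import numpy as np

def sample_indices(n_frames, video_fps, sample_hz):
    """Frame indices to evaluate when a video is sampled at sample_hz."""
    step = max(1, int(round(video_fps / sample_hz)))
    return list(range(0, n_frames, step))

def placeholder_metric(src_a, src_b, fused):
    """Hypothetical stand-in score (NOT a recommended fusion metric):
    negative mean absolute difference from the average of the two sources."""
    return -float(np.mean(np.abs(fused - 0.5 * (src_a + src_b))))

def mean_score(src_a_frames, src_b_frames, fused_frames, metric,
               video_fps=30.0, sample_hz=5.0):
    """Average an objective metric over the sampled source/fused frame sets."""
    idx = sample_indices(len(fused_frames), video_fps, sample_hz)
    scores = [metric(src_a_frames[i], src_b_frames[i], fused_frames[i])
              for i in idx]
    return float(np.mean(scores)), idx

# Synthetic example: 3 seconds of 30 Hz video from two registered sensors.
rng = np.random.default_rng(0)
vis = rng.random((90, 8, 8))
lwir = rng.random((90, 8, 8))
fused = 0.5 * (vis + lwir)
score, used = mean_score(vis, lwir, fused, placeholder_metric)
print(len(used), score)
```

Because the evaluation is performed after data capture, the sampling rate can be chosen to suit the available processing time rather than the frame rate of the sensors.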

A review of the subjective assessment literature examined in this study did not reveal any formal
clinical assessments of fusion target recognition or identification performance using classical
definitions (Holst, 2000). Because the classical definitions do not adequately address
human or other urban targets, the Night Vision and Electronic Sensors Directorate
(NVESD) has recommended new definitions to adequately discriminate friend from foe
(Self & Miller, 2005). The draft definitions are as follows:

•  Detection.  The  determination  that  an  object  or  location  in  the  field  of  view  may  be  of 
military  interest  such  that  the  military  observer  takes  an  action  to  look  closer:  alters  search 
in  progress,  changes  magnification,  selects  a  different  sensor,  or  cues  a  different  sensor. 

•  Classification.  The  object  is  distinguished  or  discriminated  by  class,  like  wheeled  or 
tracked,  human  or  other  animal.  Possibilities  are 

•  Recognition. 

o For vehicles and weapons platforms, the object can be distinguished by category
within a class, such as tank or personnel carrier in the class of tracked vehicles.

o For humans, the perception of individual elements, a combination, or a lack of,
equipment, hand-held objects, and/or posture that can be distinguished to the
extent that the human is determined to be of special military interest.

•  Identification. 

o For military vehicles and weapons systems, the object is distinguished by model,
such as M1A2 or T80.

o  For  commercial  vehicles,  the  object  is  distinguished  by  typically  known  model 
types. 

o  For  humans,  the  perception  of  individual  elements  or  a  combination  of  elements, 
such  as  clothing,  equipment,  hand-held  objects,  posture,  and/or  gender  that  can  be 
distinguished  to  the  extent  that  the  human  is  determined  to  be  armed  or  potentially 
combatant. 

•  Feature  identification. 

o  Commercial  vehicles  can  be  distinguished  by  make  and  model. 

o Individual elements of clothing, equipment, hand-held objects, and/or gender can
be discriminated by name or country/region of origin.

Future SIHS fusion studies should utilize the revised framework as a basis for developing tasks
and subjective performance measures, e.g. properly identifying an enemy target based on hand-held
objects.

Research  is  currently  underway  to  develop  adaptive  fusion  algorithms  which  adjust  their 
parameters  to  optimize  fusion  performance  (Piella,  2004).  This  approach  requires  objective 
measures  which  can  be  easily  computed  and  automated.  While  adaptive  fusion  algorithms  may  not 
be  available  for  utilization  by  SIHS,  the  approach  should  be  available  for  future  modernization 
programs. 
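
The idea of an algorithm that adjusts its own parameters against an automated objective measure can be sketched as follows. This is a deliberately simple illustration and is not the method of Piella (2004): the weighted-average fusion rule, the gradient-energy score and all function names are assumptions chosen only to show the control loop.

```python
import numpy as np

def gradient_energy(img):
    """Hypothetical objective score: mean squared spatial gradient."""
    gy, gx = np.gradient(np.asarray(img, dtype=float))
    return float(np.mean(gx ** 2 + gy ** 2))

def adaptive_weighted_fusion(src_a, src_b, score=gradient_energy, weights=None):
    """Try several blending weights and keep the one with the best score."""
    if weights is None:
        weights = np.linspace(0.0, 1.0, 11)
    best_w, best_score, best_fused = None, -np.inf, None
    for w in weights:
        fused = w * src_a + (1.0 - w) * src_b   # simple weighted average
        s = score(fused)
        if s > best_score:
            best_w, best_score, best_fused = float(w), s, fused
    return best_w, best_fused

# Synthetic example: one textured source and one featureless source.
rng = np.random.default_rng(1)
textured = rng.random((16, 16))
flat = np.full((16, 16), 0.5)
best_w, fused = adaptive_weighted_fusion(textured, flat)
print(best_w)
```

The loop only works if the score can be computed quickly and without an operator, which is why this approach depends on objective measures that are easily automated.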

Lead investigators in the image fusion community have indicated that they are now, or soon will be,
investigating task-specific fusion performance and the characterization of video fusion
performance. The proposed SIHS fusion study is thus occurring at an opportune time.

6 Conclusion and Recommendations


The  literature  search  was  conducted  to  support  SIHS  TD  Vision  SST  in  the  development  of  their 
fusion  test  bed.  The  goal  of  the  proposed  test  bed  would  be  to  help  facilitate  the  evaluation  of 
multispectral  image  fusion  with  potentially  helmet  portable  sensors.  The  specific  role  of  the  fusion 
test  bed  would  be  to  gather  registered  image  data  (source  and  fusion)  for  post  hoc  subjective  and 
objective evaluations. Specific recommendations of sensors to acquire, fusion boards to
investigate, fusion algorithms to employ and evaluation metrics to use are detailed below.
Additionally, suggestions as to what should be included in the test bed are described.

6.1  Sensor  Hardware 

Image fusion combines information contained in multispectral imagery and ultimately enhances
situational awareness. The developments in sensor hardware have made sensor fusion a reality for
defence applications. A variety of sensor types were reviewed and are recommended for inclusion
in the fusion test bed. The review could not positively confirm that small, relatively lightweight,
high-sensitivity sensors with 640 x 480 resolution operating at a minimum of 30 Hz are available.
Sensors and cameras were, however, identified that meet the resolution and real-time performance
demands. The literature review identified that fusion studies utilized the following sensors:

•  Day  camera; 

•  Night  camera  -  either  a  LLLTV,  ICMOS  sensor  or  EBAPS  sensor; 

•  NIR/SWIR  sensor; 

•  MWIR  (note  MWIR  was  not  utilized  in  man  portable  systems);  and 

•  LWIR  sensor. 

Day  Camera 

A large number of high resolution day cameras are available on the market today; they were
therefore not a focus of this search.

Night  Camera 

LLLTV (ICCD technology) - A large number of capable LLLTV cameras are available. CCDs tend to be
used in cameras (sensors) that focus on high quality images with many pixels. CCD sensor light
sensitivity is typically greater than that of CMOS. CMOS sensors usually have lower image quality
and resolution. LLLTVs tend to be more cost effective than infrared cameras. From the systems
reviewed, Intevac’s E2006 camera would be the most suitable for SIHS. It has a 1280 x 1024
resolution, runs at 30 fps, and is only 2” x 2” x 3”. The E2006 camera functions include
non-uniformity correction, histogram equalization and horizontal image orientation. The Pixelfly by
PCO Imaging could also be a suitable candidate for SIHS. It has a lower resolution of 640 x 480 with
a higher frame rate of 50 Hz and is only 1.54” x 1.54” x 2.68”.

ICMOS - ICMOS cameras are less expensive and generally use less power than CCD
technology, which gives them a superior battery life over CCD sensors. There are currently not many
CMOS sensors that would meet the specifications for SIHS. Many of these cameras are used in
manufacturing applications and commercial digital cameras. From the seven ICMOS sensors
identified, Intevac’s E2010 would be suitable for SIHS. The E2010 was designed for night
surveillance applications. Like the E2006, it has a 1280 x 1024 resolution, runs at 30 fps, and is
only 2” x 2” x 3”.

EBAPS - Intevac has developed the proprietary EBAPS sensor. It was developed for
commercial security camera applications. The NightVista incorporates non-uniformity correction,
bad pixel replacement, and histogram equalization image processing functions. As the ISIE10 is
under development, the Intevac models NightVista and ISIE6 seem like suitable candidates for
SIHS. More information, such as mechanical, size, and electrical specifications, would be helpful
before pursuing the incorporation of these sensors into SIHS. The frame rates of all EBAPS sensors
did meet the desired specification, ranging from 27.5 to 37 Hz.

SWIR  Camera 

NIR/SWIR - There were no SWIR sensors that completely satisfied all the SIHS requirements.
From the 10 SWIR sensors identified, Sensors Unlimited cameras were the most suitable for SIHS.
They offer four miniature cameras, including the SU320M-1.7RT InGaAs SWIR Camera, the NIR
MiniCamera, and the SU320MVis-1.7RT Visible and SWIR camera. However, these sensors did not quite
meet the resolution specification, as they only had a 320 x 240 resolution where the minimum
desired was 640 x 480. The other Sensors Unlimited cameras met the resolution and real time
requirements but were somewhat large at 6” x 3” x 3”. Intevac’s LIVAR 400 Short Wave Infrared
Camera would be the most suitable for SIHS (for resolution and real time), as it also met three
specifications but was missing information for the weight.

MWIR/LWIR  Camera 

MWIR and LWIR - There were also no MWIR and LWIR sensors that satisfied all the SIHS
requirements. The most suitable sensor would be Irvine Sensors’ Personal Miniature Thermal
Viewer; however, it did not meet the resolution requirement as its resolution was only 320 x 240.
Irvine’s camera uses a unique shutterless design, and there are custom optical, image processing,
packaging and interface options available. FLIR’s SC4000 and SC6000 cameras also appear to
be suitable. The FLIR data sheets did not provide any information on size or weight;
otherwise the cameras satisfied all other specifications.

6.2  Fusion  Board 

The goal of an image fusion processor is to gather the input signals and to fuse the signals into a
single output using a specific algorithm. Some image fusion processors allow the user to
manipulate the algorithm while other processors prevent the manipulation of the algorithm. For the
purpose of an open source desktop application, it is recommended that an image fusion processor
be selected that can support several inputs and provide the capability of utilizing multiple
algorithms. The Octec ADEPT60 is an image fusion processor that meets these demands and is
recommended.

6.3  Fusion  Algorithms 

Image fusion algorithms are critical to the image fusion process. They range from Multi-Scale
Decomposition techniques, which break down the input images into lower resolution and lower
spatial density images before selecting the appropriate characteristics from each input image for
use in the fused image, to Non Multi-Scale Decomposition techniques, which utilize
statistical, numerical, and artificial neural network theories to fuse images from different sources.
According to the literature reviewed, the Shift-invariant Discrete Wavelet Transform receives the
most favourable results in both subjective and objective measures for night vision imagery. The
Laplacian pyramid scheme also receives favourable results. Due to the inconsistency in results
present in the literature, it is difficult to elect one algorithm as the best. The measure used to
evaluate an algorithm can itself lead to a discrepancy in the results. Therefore, it is important
to evaluate all the algorithms for each individual

6.4  Evaluation  Metrics 

The fusion test bed will be used to collect video for post-hoc psychophysical testing (subjective).
Overall, the correlation between subjective results and statistically based objective performance
measures identified in the literature was poor. Recent developments in metrics based upon
perception-information models have shown promise. The improved correlation between subjective
and objective results for the feature and information-based metrics suggests that the following
objective metrics should be included in the fusion test bed analysis system:

•  Edge  Dependent  Fusion  Quality  Index  (QE); 

•  Fusion  Quality  Measure/Index  (Q); 

•  Weighted  Fusion  Quality  Index  (QW); 

•  Universal  Image  Quality  Index  (UIQI);  and 

• Visual Difference (VDA).

6.5 Fusion Test Bed System Overview

The  knowledge  gained  from  the  literature  review  will  help  support  the  Vision  SST  fusion  test  bed 
development.  Lessons  learned  have  been  captured  and  are  summarized  below  to  support  the  fusion 
test  bed  development.  It  should  be  noted  that  the  current  vision  of  the  fusion  test  bed  is  to  capture 
real  time  digital  video  images  for  post-hoc  fusion  assessment.  The  system  will  utilize  mains  power 
and  will  be  field  portable  but  not  ruggedized.  The  system  will  require  a  sensor  pod  containing 
multispectral sensors with a mounting system and a processing pod. Please see Figure 42 for an
example of the sensor pod used by Hines et al. (2005) for their Enhanced Vision System (EVS)
using the Retinex digital image enhancement algorithm.

Figure 42: Imaging pod from Hines et al. 2005

The  sensor  pod  could  be  physically  separated  from  the  processing  pod.  Fibre  optic  cables  could  be 
used  to  distribute  video  information  to  the  processing  pod  which  could  be  rack  mounted. 

The  test  bed  could  include  the  following  subsystems: 

o Sensors. The fusion test bed will include up to four sensors operating in LWIR, SWIR,
NIR and visible bands. The recommended sensors are detailed in
the following section.

o  Imaging  pod  (camera  rig).  System  to  align  multiple  sensors  as  closely  as  possible.  Initial 
requirements  are  detailed  in  a  following  section. 

o Image capture system. Digital images will be captured in real time (at a minimum of
30 Hz). Feeds to the fusion board / frame grabber / Digital Signal Processing (DSP) board will
be via cable.

o  Image  registration  system.  The  digital  image  sources  must  be  accurately  registered  for 
optimum  fusion  performance.  While  some  fusion  boards  will  do  this  automatically  other 
approaches  require  registration  marks. 

o  Image  processing  system.  Raw  images  may  need  pre-fusion  processing  to  optimize  fusion 
performance. 

o Fusion kernel. Source images will be fused using selectable algorithms. The fusion kernel will
also output selected objective fusion metric scores.

o  User  graphical  interface  system 

o Image storage system

o Power system

o Display system


Multispectral Imaging Sensors:

The  test  bed  should  be  able  to  collect  real  time  digital  video  from  up  to  four  sensors 
simultaneously.  The  sensors/cameras  could  be  any  of  the  following: 

o  Day  camera; 

o Night camera - either a LLLTV, ICMOS sensor or EBAPS sensor;
o  NIR/SWIR  sensor; 

o  MWIR  (note  MWIR  was  not  utilized  in  man  portable  systems);  and 
o  LWIR  sensor. 

Mounting  Pod: 

A mechanism to mount up to four sensors will be required. The sensors will be mounted in a side
by side stacked configuration to reduce parallax errors (with as small an offset as possible). A
mechanism to precisely set the elevation (pitch), roll, and yaw of each sensor will be required.
Additionally  the  sensors  should  be  mounted  on  a  mechanism  that  will  allow  for  quick  and  easy 
orientation  changes.  The  mounting  pod  should  be  able  to  attach  to  a  tripod  for  gross  adjustments. 

o  The  mounting  pod  must  be  able  to  permit  the  sensors  to  be  co-aligned  or  bore  sighted;  and 

o  While  every  effort  will  be  made  to  select  high  performance  but  miniaturized  sensors, 
COTS  systems  may  be  relatively  large.  One  approach  to  co-align  sensors  is  to  utilize 
dichroic  beam  splitters  -  see  Figure  43. 


[Figure 43 labels: Motorized (label truncated); Lincoln Laboratory low-light CCD (640 x 480 pixels);
dichroic beam splitter; uncooled LWIR (320 x 240 pixels)]

Figure 43: Dichroic beam splitter to co-align two sensors, from Waxman et al. (1998)

Image  Registration  System 

An image registration system is required to remove lens distortion errors and to remove
bore-sighting inaccuracies. Some fusion boards will utilize their own registration algorithms based
on image contours. Other systems require manual image registration.

o Propose we utilize registration markers in the scenes for continuous registration. It may be
sufficient simply to register the sensors at the start of the data collection period and at the
beginning of each session;

o Utilize a contour based registration algorithm or multifactor registration system to align
sensors. The system would make the following image feed adjustments:
   o Scaling,
   o Translation,
   o Rotation, and
   o (Match images to a common grid);

o If we need to use sensors with different FOVs, then we should utilize the lowest FOV as the
baseline for registration. The higher FOV would be registered to the lowest using affine
transforms;

o  If  possible  use  automatic  mapping  adjustment  algorithms  to  re-register  images 
periodically;  and 

o  Optical  differences  between  the  different  sensor  lenses  may  cause  minor  differences  in 
magnification  between  sensors  at  the  edges.  A  distortion  correction  board  should  reduce 
these  errors. 
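
The scaling, translation and rotation adjustments listed above can be combined into a single affine transform, and that transform can be estimated from registration markers seen by two sensors. The following Python sketch illustrates this under stated assumptions: the function names and marker coordinates are hypothetical, and a least-squares fit is only one of several possible estimation approaches.

```python
import numpy as np

def similarity_matrix(scale, angle_rad, tx, ty):
    """3x3 homogeneous matrix combining scaling, rotation and translation."""
    c, s = np.cos(angle_rad), np.sin(angle_rad)
    return np.array([[scale * c, -scale * s, tx],
                     [scale * s,  scale * c, ty],
                     [0.0, 0.0, 1.0]])

def apply_transform(matrix, points):
    """Map an N x 2 array of (x, y) pixel coordinates through the matrix."""
    pts = np.asarray(points, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))])
    return (homog @ matrix.T)[:, :2]

def estimate_affine(src_points, dst_points):
    """Least-squares affine transform taking src marker positions to dst."""
    src = np.asarray(src_points, dtype=float)
    dst = np.asarray(dst_points, dtype=float)
    design = np.hstack([src, np.ones((len(src), 1))])      # N x 3
    params, *_ = np.linalg.lstsq(design, dst, rcond=None)  # 3 x 2
    matrix = np.eye(3)
    matrix[:2, :] = params.T
    return matrix

# Hypothetical marker positions in the wider-FOV sensor image (pixels).
markers_wide = np.array([[100.0, 80.0], [520.0, 90.0],
                         [510.0, 400.0], [110.0, 390.0]])
true_map = similarity_matrix(scale=1.2, angle_rad=0.05, tx=-15.0, ty=8.0)
markers_narrow = apply_transform(true_map, markers_wide)

estimated = estimate_affine(markers_wide, markers_narrow)
print(np.round(estimated, 3))
```

At least three non-collinear markers are needed for the fit. Re-running the estimate periodically during a session would correspond to the periodic re-registration proposed above.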

Image  Processing  System 

While  registered  images  will  be  collected  by  the  fusion  test  bed,  it  is  proposed  that  the  fusion  test 
bed  be  designed  to  permit  real  time  image  processing  and  fusion.  Possible  real  time  image 
processing  to  include: 

o  Contrast  normalization; 
o  Adaptive  histogram  equalization; 
o  Adaptive  -  automatic  gain  and  level  functionality; 
o  Image  enhancement; 
o  Radiometric  transformation; 
o  Dynamic  range  compression; 
o  Colour  consistency; 
o  Colour  and  lightness  rendition; 
o  Noise  filtering;  and 

o Intensity stretching, edge sharpening, haze removal, adaptive smoothing, isotropic
smoothing.
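
As an illustration of the kind of pre-fusion processing listed above, the sketch below applies global histogram equalization to an 8-bit image. It is a simplified stand-in for the adaptive variant named in the list, which applies the same idea within local image regions; the function name is hypothetical.

```python
import numpy as np

def equalize_histogram(img):
    """Global histogram equalization for an 8-bit greyscale image."""
    img = np.asarray(img, dtype=np.uint8)
    hist = np.bincount(img.ravel(), minlength=256)
    cdf = np.cumsum(hist)
    cdf_min = cdf[cdf > 0][0]          # count at the darkest occupied level
    total = img.size
    if total == cdf_min:               # constant image: nothing to stretch
        return img.copy()
    # Map each grey level through the normalized cumulative distribution.
    lut = np.round((cdf - cdf_min) / (total - cdf_min) * 255.0)
    lut = np.clip(lut, 0, 255).astype(np.uint8)
    return lut[img]

# A low-contrast patch occupying only four adjacent grey levels.
patch = np.array([[100, 101], [102, 103]], dtype=np.uint8)
stretched = equalize_histogram(patch)
print(stretched)
```

The lookup table preserves the ordering of grey levels while spreading them over the full output range, which is why the operation is often used to normalize contrast between sensors before fusion.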

Image  Fusion  System 

The  fusion  test  bed  will  include  a  fusion  kernel  to  implement  selected  fusion  algorithms. 

Imaging  Fusion  Module  (Algorithms) 

The  fusion  test  bed  should  permit  the  use  of  different  fusion  approaches.  Fusion  algorithms 
selected  for  testing  should  include  as  a  minimum  the  following: 

o  Shift-invariant  Discrete  Wavelet  Transform;  and 
o  The  Laplacian  pyramid  scheme. 
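
A minimal sketch of Laplacian pyramid fusion is given below to show how a multi-scale decomposition scheme operates. The 5-tap smoothing kernel, the pixel-repetition upsampling, the choose-maximum-absolute rule for detail levels and the averaging of the coarsest level are all assumptions made for illustration; published implementations differ in each of these choices, and the function names are hypothetical.

```python
import numpy as np

KERNEL = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0

def blur(img):
    """Separable 5-tap smoothing with edge replication."""
    h, w = img.shape
    padded = np.pad(img, 2, mode="edge")
    rows = sum(KERNEL[k] * padded[:, k:k + w] for k in range(5))
    return sum(KERNEL[k] * rows[k:k + h, :] for k in range(5))

def reduce_level(img):
    """Smooth, then keep every second row and column."""
    return blur(img)[::2, ::2]

def expand_level(img, shape):
    """Upsample by pixel repetition to the requested shape, then smooth."""
    up = np.repeat(np.repeat(img, 2, axis=0), 2, axis=1)
    return blur(up[:shape[0], :shape[1]])

def laplacian_pyramid(img, levels):
    gauss = [np.asarray(img, dtype=float)]
    for _ in range(levels):
        gauss.append(reduce_level(gauss[-1]))
    lap = [gauss[i] - expand_level(gauss[i + 1], gauss[i].shape)
           for i in range(levels)]
    lap.append(gauss[-1])              # coarsest approximation
    return lap

def fuse_laplacian(src_a, src_b, levels=3):
    pa = laplacian_pyramid(src_a, levels)
    pb = laplacian_pyramid(src_b, levels)
    # Detail levels: keep the coefficient with the larger magnitude.
    fused = [np.where(np.abs(a) >= np.abs(b), a, b)
             for a, b in zip(pa[:-1], pb[:-1])]
    fused.append(0.5 * (pa[-1] + pb[-1]))   # coarsest level: average
    out = fused[-1]
    for detail in reversed(fused[:-1]):
        out = expand_level(out, detail.shape) + detail
    return out

rng = np.random.default_rng(2)
img_a = rng.random((32, 32))
img_b = rng.random((32, 32))
fused_img = fuse_laplacian(img_a, img_b)
print(fused_img.shape)
```

Because each detail level stores exactly what the expansion step loses, fusing an image with itself returns the original image, which provides a simple check on an implementation.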

Fusion  Assessment  Module. 

The  fusion  test  bed  should  permit  the  evaluation  of  image  fusion  using  a  variety  of  objective 
metrics.  Possible  metrics  to  include: 

o Edge Dependent Fusion Quality Index (QE);
o Fusion Quality Measure/Index (Q);
o Weighted Fusion Quality Index (QW);
o Universal Image Quality Index (UIQI); and
o Visual Difference (VDA).
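
To indicate how easily such a metric can be automated, the sketch below computes the Universal Image Quality Index between two images over the whole frame. This is a simplification: the index is normally computed in small sliding windows and then averaged, and the fusion quality indices listed above add further weighting that is not reproduced here. The function name is hypothetical.

```python
import numpy as np

def uiqi(x, y):
    """Universal Image Quality Index computed over the whole image:
    Q = 4 * cov(x, y) * mean(x) * mean(y)
        / ((var(x) + var(y)) * (mean(x)^2 + mean(y)^2))."""
    x = np.asarray(x, dtype=float).ravel()
    y = np.asarray(y, dtype=float).ravel()
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(ddof=1), y.var(ddof=1)
    cov = np.sum((x - mx) * (y - my)) / (x.size - 1)
    denom = (vx + vy) * (mx ** 2 + my ** 2)
    if denom == 0.0:
        return float("nan")            # index is undefined in this case
    return float(4.0 * cov * mx * my / denom)

rng = np.random.default_rng(3)
reference = rng.integers(0, 256, size=(16, 16)).astype(float)
inverted = 255.0 - reference
print(uiqi(reference, reference), uiqi(reference, inverted))
```

The index equals 1 only when the two images are identical and falls as correlation, mean luminance or contrast diverge.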

Image  Storage  System 

The  test  bed  will  require  the  capability  to  capture  the  following  images  and  information: 

o  Raw  sensor  images; 
o  Fused  images; 

o  Sensor  setup  and  parameter  adjustments;  and 
o  Sensor  registration. 

Graphical  User  Interface 

The  fusion  test  bed  will  require  a  graphical  user  interface  to  permit  the  operator  to  perform  the 
following  functions: 

o  Raw  image  processing; 

o  Fusion  algorithm  selection  and  parameter  adjustment; 
o  Parameter  setting  capture;  and 
o Image capture controls.

A number of commercial software modules are available to support image fusion. In addition to
fusion tools, a number of image processing tools and software programs are available. Indigo
Systems offers a number of radiometric software modules to acquire, radiometrically calibrate,
analyze and document data from digital imaging systems.

Image  storage  system 

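The test bed will store the captured images on a suitably sized hard drive.

Sensor Specifications (Ref# A1-1 to A1-15)

Column groups: Performance Specifications; Sensor Specifications; Dimensions.
Columns: Ref#; Manufacturer; Product; Monochrome / Color; Imaging Technology; Horizontal Resolution
(lines); Vertical Resolution (lines); Maximum Frame Rate (fps); At least 640 x 480 and 30 fps;
Shutter Speed (seconds); Sensitivity (Lux); Performance; Format / Output; Resolution; Lens Mount;
Shutter Control; Quantum Efficiency (%); Fill Factor (%); Width / Diameter (inch); Height (inch);
Length (inch); Weight (lb, oz); Notes. Columns with no entry in the source are omitted from each
record below.

A1-1. DVC, DVC-1412 Series - DVC-1412AM-MS
  Monochrome / Color: Monochrome. Imaging technology: CCD. Resolution (lines): 1392 (H) x 1040 (V).
  Maximum frame rate (fps): 100. At least 640 x 480 and 30 fps: No.
  Shutter speed (s): 9.30E-5 to 0.0980. Sensitivity (Lux): 0.2368. Performance: High Speed, Gain Control, Cooled*.
  Format / Output: RS644 / LVDS, FireWire, CameraLink. Resolution: 12 bits. Lens mount: C-Mount.
  Shutter control: Electronic Shutter. Quantum efficiency (%): 62. Fill factor (%): 100.
  Width x Height x Length (inch): 3.26 x 4.47 x 1.89. Weight (lb, oz): 1, 2.
  Notes: Fast readout, low noise, high signal to noise ratio, asynchronous reset.

A1-2. DVC, DVC-1412 Series - DVC-1412AM-MT
  Monochrome / Color: Monochrome. Imaging technology: CCD. Resolution (lines): 1392 (H) x 1040 (V).
  Maximum frame rate (fps): 100. At least 640 x 480 and 30 fps: No.
  Shutter speed (s): 9.30E-5 to 0.0980. Sensitivity (Lux): 0.2368. Performance: High Speed, Gain Control, Cooled*.
  Format / Output: RS644 / LVDS, FireWire, CameraLink. Resolution: 12 bits. Lens mount: C-Mount.
  Shutter control: Electronic Shutter. Quantum efficiency (%): 62. Fill factor (%): 100.
  Width x Height x Length (inch): 3.9 x 4.79 x 2.72. Weight (lb, oz): 1, 2.
  Notes: Fast readout, low noise, high signal to noise ratio, w/ cooler.

A1-3. DVC, DVC-1412 Series - DVC-1412AC-00
  Monochrome / Color: Color. Imaging technology: CCD. Resolution (lines): 1392 (H) x 1040 (V).
  Maximum frame rate (fps): 100. At least 640 x 480 and 30 fps: No.
  Shutter speed (s): 9.30E-5 to 0.0980. Sensitivity (Lux): 0.7535. Performance: High Speed, Gain Control.
  Format / Output: RS644 / LVDS, FireWire, CameraLink. Resolution: 12 bits. Lens mount: C-Mount.
  Shutter control: Electronic Shutter. Quantum efficiency (%): 62. Fill factor (%): 100.
  Width x Height x Length (inch): 3.25 x 3.25 x 1.73. Weight (lb, oz): 1, 2.
  Notes: Fast readout, low noise, high signal to noise ratio, asynchronous reset.

A1-4. DVC, DVC-1412 Series - DVC-1412AC-TE
  Monochrome / Color: Color. Imaging technology: CCD. Resolution (lines): 1392 (H) x 1040 (V).
  Maximum frame rate (fps): 100. At least 640 x 480 and 30 fps: No.
  Shutter speed (s): 9.30E-5 to 0.0980. Sensitivity (Lux): 0.7535. Performance: High Speed, Gain Control, Cooled.
  Format / Output: RS644 / LVDS, FireWire, CameraLink. Resolution: 12 bits. Lens mount: C-Mount.
  Shutter control: Electronic Shutter. Quantum efficiency (%): 62. Fill factor (%): 100.
  Width x Height x Length (inch): 3.9 x 3.9 x 2.57. Weight (lb, oz): 1, 2.
  Notes: Fast readout, low noise, high signal to noise ratio, asynchronous reset.

A1-5. Intevac, E2006 Low Light Level Camera
  Resolution (lines): 1280 (H) x 1024 (V). Maximum frame rate (fps): 30. At least 640 x 480 and 30 fps: Yes.
  Shutter speed or sensitivity (column not recoverable from the source): 0.00001.
  Format / Output: RS-232. Resolution: 10 bits. Lens mount: C-Mount.
  Width x Height x Length (inch): 2 x 2 x 3. Weight (lb, oz): 0, 8.4.
  Notes: Camera functions include non-uniformity correction, histogram equalization, horizontal image orientation.

A1-6. NAC Image Technology, HSV High-Speed Color Video System - HSV-500C3
  Monochrome / Color: Color. Imaging technology: CCD. Resolution (lines): 510 (H) x 485 (V).
  Maximum frame rate (fps): 500. At least 640 x 480 and 30 fps: No.
  Shutter speed (s): 1.00E-4 to 0.0020. Sensitivity (Lux): 2500. Performance: High Speed.
  Format / Output: NTSC, RS232. Resolution: 24. Lens mount: Bayonet.
  Shutter control: Electronic Shutter, External Trigger.
  Width x Height x Length (inch): 2.99 x 3.03 x 5.59. Weight (lb, oz): 2, 0.
  Notes: Ideal for military testing and monitoring, and biomechanical applications.

A1-7. NAC Image Technology, Memrecam fx - RX5 Micro Camera Head
  Monochrome / Color: Monochrome, Color. Imaging technology: CCD. Resolution (lines): 1280 (H) x 1024 (V).
  Maximum frame rate (fps): 10000. At least 640 x 480 and 30 fps: Yes.
  Shutter speed (s): ? to 1.00E-6. Sensitivity (Lux): 5000. Performance: High Speed.
  Format / Output: Ethernet, Fibre Channel, VGA. Resolution: 10 bits. Lens mount: F-Mount, NF.
  Shutter control: Electronic Shutter, External Trigger.
  Width x Height x Length (inch): 0.827 x 0.827 x 3.9. Weight (lb, oz): 0, 9.6.
  Notes: Single or multi-camera head high speed color video system.

A1-8. PCO Imaging, PCO Series - Model PCO.1600
  Monochrome / Color: Monochrome, Color. Imaging technology: CCD. Resolution (lines): 1600 (H) x 1200 (V).
  Maximum frame rate (fps): 30. At least 640 x 480 and 30 fps: No.
  Shutter speed (s): 5.00E-7 to 4.23E6. Performance: Cooled, Anti-Blooming.
  Format / Output: FireWire, CameraLink. Resolution: 14 bits. Lens mount: C-Mount, F-Mount.
  Shutter control: Electronic Shutter. Quantum efficiency (%): 55.
  Width x Height x Length (inch): 3.31 x 2.6 x 6.89. Weight (lb, oz): 4, 0.
  Notes: Image sensor cooled thermoelectrically, integrated image memory.

A1-9. PCO Imaging, PCO Series - Model PCO.2000
  Monochrome / Color: Monochrome, Color. Imaging technology: CCD. Resolution (lines): 2048 (H) x 2048 (V).
  Maximum frame rate (fps): 14.7. At least 640 x 480 and 30 fps: No.
  Shutter speed (s): 5.00E-7 to 4.23E6. Performance: Cooled, Anti-Blooming.
  Format / Output: FireWire, CameraLink. Resolution: 14 bits. Lens mount: C-Mount, F-Mount.
  Shutter control: Electronic Shutter. Quantum efficiency (%): 55.
  Width x Height x Length (inch): 3.31 x 2.6 x 6.89. Weight (lb, oz): 4, 0.
  Notes: Image sensor cooled thermoelectrically, integrated image memory.

A1-10. PCO Imaging, PCO Series - Model PCO.4000
  Monochrome / Color: Monochrome, Color. Imaging technology: CCD. Resolution (lines): 4008 (H) x 2672 (V).
  Maximum frame rate (fps): 5. At least 640 x 480 and 30 fps: No.
  Shutter speed (s): 5.00E-6 to 4.23E6. Performance: Cooled, Anti-Blooming.
  Format / Output: FireWire, CameraLink. Resolution: 14 bits. Lens mount: C-Mount, F-Mount.
  Shutter control: Electronic Shutter. Quantum efficiency (%): 50.
  Width x Height x Length (inch): 3.31 x 2.6 x 6.89. Weight (lb, oz): 4, 3.
  Notes: Image sensor cooled thermoelectrically, integrated image memory.

A1-11. PCO Imaging, Pixelfly Series - Model Pixelfly
  Monochrome / Color: Monochrome, Color. Imaging technology: CCD. Resolution (lines): 640 (H) x 480 (V).
  Maximum frame rate (fps): 50 / 95 / 177. At least 640 x 480 and 30 fps: Yes.
  Shutter speed (s): 5.00E-6 to 65.00. Performance: Anti-Blooming.
  Format / Output: RS644 / LVDS, Ethernet, RJ45 Connector. Resolution: 12 bits. Lens mount: C-Mount.
  Shutter control: Electronic Shutter. Quantum efficiency (%): 43.
  Width x Height x Length (inch): 1.54 x 1.54 x 2.68. Weight (lb, oz): 0, 8.8.
  Notes: Has digital temperature compensation instead of thermoelectrical cooling.

A1-12. PCO Imaging, Pixelfly Series - Model Pixelfly QE
  Monochrome / Color: Monochrome, Color. Imaging technology: CCD. Resolution (lines): 1392 (H) x 1024 (V).
  Maximum frame rate (fps): 23. At least 640 x 480 and 30 fps: No.
  Shutter speed (s): 5.00E-6 to 65.00. Performance: Anti-Blooming.
  Format / Output: RS644 / LVDS, Ethernet, RJ45 Connector. Resolution: 12 bits. Lens mount: C-Mount.
  Shutter control: Electronic Shutter. Quantum efficiency (%): 62.
  Width x Height x Length (inch): 1.54 x 1.54 x 2.68. Weight (lb, oz): 0, 8.8.
  Notes: Has digital temperature compensation instead of thermoelectrical cooling.

A1-13. PCO Imaging, Sensicam Series - Model Sensicam EM
  Monochrome / Color: Monochrome, Color. Imaging technology: CCD, emCCD. Resolution (lines): 1004 (H) x 1002 (V).
  Maximum frame rate (fps): 13. At least 640 x 480 and 30 fps: No.
  Shutter speed (s): 7.50E-5 to 3600. Performance: Cooled, Anti-Blooming.
  Format / Output: Serial. Resolution: 12 bits. Lens mount: C-Mount, F-Mount.
  Shutter control: Electronic Shutter. Quantum efficiency (%): 65.
  Width x Height x Length (inch): 3.66 x 3.07 x 8.27. Weight (lb, oz): 3, 8.
  Notes: Image sensor cooled thermoelectrically, used for night vision applications.

A1-14. PCO Imaging, Sensicam Series - Model Sensicam QE
  Monochrome / Color: Monochrome, Color. Imaging technology: CCD. Resolution (lines): 1374 (H) x 1040 (V).
  Maximum frame rate (fps): 19.8. At least 640 x 480 and 30 fps: No.
  Shutter speed (s): 5.00E-7 to 1000. Performance: Cooled, Anti-Blooming.
  Format / Output: Serial. Resolution: 12 bits. Lens mount: C-Mount.
  Shutter control: Electronic Shutter. Quantum efficiency (%): 62.
  Width x Height x Length (inch): 3.66 x 3.07 x 8.27. Weight (lb, oz): 3, 8.
  Notes: Image sensor cooled thermoelectrically, used for fluorescence imaging.

A1-15. PCO Imaging, Sensicam Series - Model Sensicam QE Double Shutter
  Monochrome / Color: Monochrome, Color. Imaging technology: CCD. Resolution (lines): 1376 (H) x 1040 (V).
  Maximum frame rate (fps): 19.8. At least 640 x 480 and 30 fps: No.
  Shutter speed (s): 5.00E-7 to 3600. Performance: Cooled, Anti-Blooming.
  Format / Output: Serial. Resolution: 12 bits. Lens mount: C-Mount.
  Shutter control: Electronic Shutter. Quantum efficiency (%): 62.
  Width x Height x Length (inch): 3.66 x 3.07 x 8.27. Weight (lb, oz): 3, 8.
  Notes: Image sensor cooled thermoelectrically, high spectral sensitivity.

Sensor Specifications (Ref# A2-1 to A2-7)

Column groups: Performance Specifications; Dimensions.
Columns: Ref#; Manufacturer; Product; Monochrome / Color; Specialty Camera Type; Application /
Industry; Horizontal Resolution (lines); Vertical Resolution (lines); Maximum Frame Rate (fps);
At least 640 x 480 and 30 fps; Shutter Speed (seconds); Sensitivity (Lux); Performance; Format /
Output; Resolution; Lens Mount; Shutter Control; Quantum Efficiency (%); Width / Diameter (inch);
Height (inch); Length (inch); Weight (oz); Operating Temperature (F); Notes.

A2-1. Intevac, NightVista E2010
  Application / Industry: Surveillance applications. Resolution (lines): 1280 (H) x 1024 (V).
  Maximum frame rate (fps): 30. At least 640 x 480 and 30 fps: Yes.
  Shutter speed or sensitivity (column not recoverable from the source): 10^-4 to 10^+4.
  Resolution: 10 bits. Lens mount: C-Mount.
  Width x Height x Length (inch): 2 x 2 x 3. Weight (oz): 8.
  Notes: CMOS-based day/night video camera may be used for day or night surveillance.

A2-2. Intevac, NightVista E3010
  Specialty camera type: plug and play. Application / Industry: Surveillance applications.
  Resolution (lines): 1280 (H) x 1024 (V). Maximum frame rate (fps): 30. At least 640 x 480 and 30 fps: Yes.
  Performance: Progressive scan.
  Notes: “plug and play” digital image intensifier (DI2) module specifically designed for integration
  into imaging systems such as head/helmet-mounted displays, rifle sights and small EO/IR surveillance gimbals.

A2-3. Irvine Sensors Corp., MVC-FF0229
  Monochrome / Color: Monochrome. Application / Industry: Industrial, Security, Traffic Control, Other.
  Resolution (lines): 752 (H) x 480 (V). Maximum frame rate (fps): 60. At least 640 x 480 and 30 fps: Yes.
  Resolution: 10 bits. Shutter control: Electronic global.
  Dimension / weight entries (column assignment not recoverable from the source): 0.627; 1.34; 1.4.
  Operating temperature (F): 40.00 to 185.
  Notes: High speed, low noise, global shuttered. Byte-wide output.

A2-4. PCO Imaging, PCO Series - Model PCO.1200 HS
  Monochrome / Color: Monochrome, Color. Specialty camera type: High Speed.
  Application / Industry: Broadcast, Industrial, Scientific, Other, High Speed Particle Image Velocimetry.
  Resolution (lines): 1280 (H) x 1024 (V). Maximum frame rate (fps): 1357. At least 640 x 480 and 30 fps: Yes.
  Shutter speed (s): 5.00E-8 to 5.00. Performance: Anti-Blooming.
  Format / Output: FireWire, CameraLink. Resolution: 10 bits. Lens mount: C-Mount, F-Mount.
  Shutter control: Electronic Shutter. Quantum efficiency (%): 27.
  Width x Height x Length (inch): 3.31 x 2.6 x 6.89. Operating temperature (F): 41.00 to 104.
  Notes: High speed, low noise, has fast image recording with 1 GB per second.

A2-5. PCO Imaging, PCO Series - Model PCO.1200 S
  Monochrome / Color: Monochrome, Color. Specialty camera type: High Speed.
  Application / Industry: Broadcast, Industrial, Scientific, Other, Hydrodynamics, Fuel Injection.
  Resolution (lines): 1280 (H) x 1024 (V). Maximum frame rate (fps): 1068. At least 640 x 480 and 30 fps: Yes.
  Shutter speed (s): 1.00E-6 to 1.0000. Performance: Anti-Blooming.
  Format / Output: Ethernet, FireWire, CameraLink. Resolution: 10 bits. Lens mount: C-Mount, F-Mount.
  Shutter control: Electronic Shutter. Quantum efficiency (%): 25.
  Width x Height x Length (inch): 3.31 x 2.6 x 6.89. Operating temperature (F): 41.00 to 104.
  Notes: High speed, has fast image recording with 1 GB per second.

A2-6. Prosilica Inc., GE640C - 02-2001A
  Monochrome / Color: Color. Specialty camera type: High Speed, Vision Sensor.
  Application / Industry: Industrial, Security.
  Resolution (lines): 659 (H) x 493 (V). Maximum frame rate (fps): 200. At least 640 x 480 and 30 fps: Yes.
  Shutter speed (s): 2.00E-5 to 5.00. Performance: Progressive Scan, Gamma Correction, Gain Control, Anti-Blooming.
  Format / Output: Ethernet. Resolution: 10 bits. Lens mount: C-Mount, CS-Mount*.
  Shutter control: Electronic Shutter, External Trigger.
  Dimension / weight entries (column assignment not recoverable from the source): 2; 1.5; 2.46.
  Operating temperature (F): ? to 122.
  Notes: GigE Vision, gigabit Ethernet, high speed, global shutter, VGA.

A2-7. Vision Research Inc, High Speed Camera - Phantom® v6.2
  Monochrome / Color: Monochrome, Color. Specialty camera type: High Speed.
  Resolution (lines): 512 (H) x 512 (V). Maximum frame rate (fps): 1400. At least 640 x 480 and 30 fps: No.
  Shutter speed (s): 5.00E-6 to 1.00E-5. Sensitivity (Lux): 1200.
  Format / Output: NTSC, PAL, RS232, Ethernet, SDI. Resolution: 8 bits, 24. Lens mount: C-Mount.
  Shutter control: External Trigger.
  Dimension / weight entries (column assignment not recoverable from the source): 3; 3; 2.
  Notes: Multi-head high speed digital imaging system.

Sensor Specifications (Ref# A3-1 onward)

Columns: Ref#; Manufacturer; Product; Length (inch); Width / Diameter (inch); Height (inch);
FOV (degrees); Weight (lbs, oz); Detector resolution (pixels); Video refresh rate (Hz); At least
640 x 480 and 30 fps; Pitch (micron); Detector Type; Thermal sensitivity (mK); Spectral Response
(microns); Notes.

A3-1. FLIR, SC4000 HS-NIR / SC6000 HS-NIR
  FOV (degrees): depends on lens, range 5.5 x 4.4 to 63.2 x 52.4.
  Detector resolution (pixels): 320 x 256 / 640 x 512. Video refresh rate (Hz): programmable 1-420 / 1-126.
  At least 640 x 480 and 30 fps: Yes. Detector type: Indium Gallium Arsenide (InGaAs).
  Spectral response (microns): 0.9 - 1.7.
  Notes: New standard for military thermal imaging. Available in various wavebands across the spectrum.

A3-2. Intevac, LIVAR 400 Short Wave Infrared Camera
  Length x Width x Height (inch): 2.5 x 2.58 x 2.8.
  Detector resolution (pixels): 640 x 480. Video refresh rate (Hz): up to 28.5.
  At least 640 x 480 and 30 fps: Yes. Detector type: EBCMOS. Spectral response (microns): 0.95 - 1.55.
  Notes: Long range target identification for airborne, ground, maritime and dismounted platforms
  (beyond 20 km). Complements cooled and uncooled FLIR detection devices and provides long range
  high resolution target ID. Perfect for covert operators and compact systems.

A3-3. Sensors Unlimited Inc (Goodrich), SU640SDV-1.7RT / SU640SDV Vis-1.7RT High Resolution InGaAs and
  Vis-InGaAs SWIR Area Cameras
  Length x Width x Height (inch): 6.22 x 3 x 3. Weight (lbs, oz): ~2, 6.
  Detector resolution (pixels): 640 x 512. Video refresh rate (Hz): 30. At least 640 x 480 and 30 fps: Yes.
  Pitch (micron): 25. Detector type: InGaAs, CMOS readout. Spectral response (microns): 0.4 - 1.7 or 0.9 - 1.7.

A3-4. Sensors Unlimited Inc (Goodrich), SU640SDWH-1.7RT / SU640SDWHVIS-1.7RT High Resolution Windowing
  InGaAs and Vis-InGaAs SWIR Area Cameras
  Length x Width x Height (inch): 6.22 x 3 x 3. Weight (lbs, oz): ~2, 6.
  Detector resolution (pixels): 640 x 512. Video refresh rate (Hz): 109. At least 640 x 480 and 30 fps: Yes.
  Pitch (micron): 25. Detector type: InGaAs. Spectral response (microns): 0.4 - 1.7 or 0.9 - 1.7.

A3-5. Sensors Unlimited Inc (Goodrich), SU320KTX-1.7RT High Sensitivity InGaAs SWIR Camera
  Length x Width x Height (inch): 1.64 x 1.5 x 1.5. Weight (lbs, oz): ~3, 2.
  Detector resolution (pixels): 320 x 240. Video refresh rate (Hz): 60. At least 640 x 480 and 30 fps: No.
  Pitch (micron): 40. Detector type: InGaAs. Spectral response (microns): 0.9 - 1.7.

A3-6. Sensors Unlimited Inc (Goodrich), SU320M-1.7RT InGaAs SWIR Camera
  Length x Width x Height (inch): 1.96 x 2.36 x 3.74. Weight: <11 oz.
  Detector resolution (pixels): 320 x 240. Video refresh rate (Hz): 50-60. At least 640 x 480 and 30 fps: No.
  Pitch (micron): 40. Detector type: InGaAs. Spectral response (microns): 05 O [sic].

A3-7. Sensors Unlimited Inc (Goodrich), SU320MX-1.7RT High Sensitivity InGaAs NIR MiniCamera
  Length x Width x Height (inch): 1.96 x 2.36 x 3.74. Weight: <11 oz.
  Detector resolution (pixels): 320 x 240. Video refresh rate (Hz): 25 - 30. At least 640 x 480 and 30 fps: No.
  Pitch (micron): 40. Detector type: InGaAs. Spectral response (microns): 0.9 - 1.7.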

A3-8 

Sensors  Unlimited  Inc  (Goodrich) 

SU320MVis-1 .7RT  Visible  and  SWIR 
Response  InGaAs  NIR  MiniCamera 

1.96 

2.36 

3.74 

<11  oz 

320  x  240 

50  -60 

No 

40 

InGaAs 

0.4-  1.7 

A3-9 

Sensors  Unlimited  Inc  (Goodrich) 

SU320MSVis-1 .7RT  Visible  and  SWIR 
Response  InGaAs  MiniCamera 

1.96 

2.36 

3.74 

<11  oz 

320  x  256 

25  -30 

No 

25 

InGaAs 

0.4-  1.7 

A3- 10 

Lumitron 

NIR320  OEM  InGaAs  Near-Infrared  Camera 

5.25 

3 

3 

depends  on  lens: 
range  3.9  x  3.1 
to  21  x  16.8 

2,8 

320  x  256 

60 

No 

30 

InGaAs 

0.9-  1.7 

Humansystems  Incorporated 
Ref#

Manufacturer 

Product 

Length 

(inch) 

Width  / 
Diameter 
(inch) 

Height 

(inch) 

FOV 

(degrees) 

Weight  (lbs, 
oz) 

Detector 

resolution 

(pixels) 

Video  refresh 
rate  (Hz) 

At  least  640  x 
480  and  30  fps 

Pitch  (micron) 

Detector  Type 

Thermal 

sensitivity 

(mK) 

Spectral 

Response 

(microns) 

Display  polarity 

Time  to 
operation 
(seconds) 

Power  source 

Operating  time 
(hours) 

Detection 

Range 

(metres) 

Diopter 

Adjustment 

(diopters) 

Interpupillary 

Adjustment 

(mm) 

System 

Magnification 

A4-1 

DRS  NVEC 

UTWS  (Urban  Thermal 
Weapon  Sight  II) 

12.1 

2.6 

3.8 

18 x 13.5

2 

320  x  240 

60 

No 

25 

VOx microbolometer

<50 

8-12  (LWIR) 

black  hot/  white 
hot  gray  scale, 
green  scale,  red 
scale 

4  typical 

4  AA  batteries 

10 

550 

A4-2 

DRS  NVEC 

CSTWS  (Crew  Served 
Thermal  Weapon  Sight  II) 

16 

4.25 

5.5 

9 x 6.9 (e-zoom 3 x 2.3)

4,  2 

640  x  480 

30 

Yes 

25 

VOx microbolometer

<50 

8-12  (LWIR) 

black  hot/  white 
hot  gray  scale, 
green  scale,  red 
scale 

4 

6  AA  batteries 

18 

2200 

A4-3 

DRS  NVEC 

MX-2  (Rugged  miniature 
thermal  imager) 

8.5 

5.5 

3 

18  x  13.5 

2,  13 

320  x  240 

60 

No 

28 

VOx microbolometer

<70 

8-12  (LWIR) 

white-hot / black-hot

<12 

cassette  of  6 
AA  batteries 

7.5 

530 

A4-4 

DRS  NVEC 

COBRA-IR  (Covert  Over 
Barrier  Recon  Assistant  - 
Infrared) 

2.5 

4 

9.5 

36x27 

1,  14 

320  x  240 

30 

No 

38 

VOx microbolometer

=80 

8-12  (LWIR) 

white-hot 

<3  typical 

3  AA  batteries 

>4 

up  to  500 

A4-5 

DRS  NVEC 

HelmetIR 

3 

3 

3 

17 x 12

1, 3

160 x  120 

20 

No 

47 

Amorphous 

silicon 

microbolometer 

=100 

8-12  (LWIR) 

white-hot 

5  typical 

2  AA  batteries 

7+ 

320 

A4-6 

DRS  NVEC 

Mini-IR  Plus  (hand-held 
thermal  imager) 

5.25 

4.5 

2 

11x8 

160 x 120 

30 

No 

30 

Amorphous 

silicon 

microbolometer 

=60 

8-12  (LWIR) 

white-hot 

<5  typical 

2  AA  batteries 

=4 

up to 450

A4-7 

DRS  NVEC 

PVS-7  Style  (Single  tube 
NVG) 

6 

8.25 

3.5 

40 

2  AA  batteries 

55 

-6  to  +2 

15 

1  x  (3x  with 
afocal  lens) 

A4-8 

DRS  NVEC 

MANTIS  (Multi-Adaptable 
Night  Tactical  Imaging 
System) 

3 

2 

2.5 

15.8  oz 

2  AA  batteries 

40-60 

-6  to  +2 

1x

A4-9 

DRS  NVEC 

4x  Raptor  (4-power  night 
vision  weapon  sight) 

12 

3.75 

3.5 

8.3 

3,  6 

2  AA  batteries 

40 

-5  to  +  2 

30 

4x 

A4-10 

DRS  NVEC 

6x  Raptor  (6-power  night 
vision  weapon  sight) 

14 

4.5 

4.5 

5.7 

5,  8 

2  AA  batteries 

40 

-5  to  +  2 

30 

6x 

A4-11 

ELCAN 

(Raytheon) 

SpecterIR+

9x7 

<3,  0 

320  x  240 

8 -12  (LWIR) 

3  AA  batteries 

>4 

750 

2x 

A4-12 

FLIR

ThermoVision  Photon 

2.1 

2 

1.8 

47x35 

~4oz 

320  x  240 

microbolometer 

7.5-13.5 

(LWIR) 

2x 

A4-13 

FLIR

ThermoVision  ThermoSight 

10 

15.5x9.9 

~1, 3

7.5-13.5 

(LWIR) 

4  AA  batteries 

2.5-7 

A4-14 

FLIR

ThermoVision  A10 

1.45 

1.35 

1.45 

160 x 120 

Uncooled 

microbolometer 

7.5-13.5 

(LWIR) 

2  max 

A4-15 

FLIR

SC4000  HS  MWIR/ 

SC6000  HS-MWIR 

depends on lens: range 11 x 9 to 62 x 51

320  x  256 / 
640x512 

programmable 1-420 / 1-126

Yes 

25 

Indium 

Antimonide 

(InSb) 

<25 (18 typical)

3.0  -5.0 
(MWIR) 

A4-16 

FLIR

SC4000  HS  LWIR/  SC6000 
HS-LWIR 

depends  on 
lens:  range 
5.5  x  4.4  to 
63.2  x  52.4 

320  x  256 / 
640x512 

programmable 
1-420/ 1-  127 

Yes 

25 

Gallium 

Arsenide  (GaAs) 
QWIP 

<35 

8.0  -9.2 
(LWIR) 

A4-17 

Irvine  Sensors 

Personal  Miniature  Thermal 
Viewer 

4 

1.8 

3 

20  or  40 

<  12  oz 

320  x  240 

60 

No 

<50 

LWIR 

<0.4  sec 

2  AA  batteries 

>5 

A4-18 

Irvine  Sensors 

CAM-NOIR  Thermal 

Camera 

320  x  240 

No 

A4-19 

Irvine  Sensors 

Miniature  Camera 

from 320 x 240 to 1280 x 1024

4  AA  batteries 

A4-20 

L3 

Communications 

Thermal-Eye 

X200xp 

5.25 

4.5 

2 

11  x8 

13  oz 

160 x 120 

30 

No 

30 

amorphous 

silicon 

microbolometer 

=50 

7-14  (LWIR) 

white = hot, black = cold

~3 

2  AA  batteries 

2-6 

450 

A4-21 

L3 

Communications 

Thermal-Eye 

3600AS 

1.79 

1-1.3 

2.5 

50x37 

2.38 

160 x 120 

30 

No 

30 

amorphous 

silicon 

microbolometer 

<50 

7-14  (LWIR) 

~2.4 

A4-22 

L3 

Communications 

Thermal-Eye 

3620AS 

1.79 

1-1.3 

2.5 

11 x 8, 17 x 12, or 32 x 24

2.38 

160 x  120 

30 

No 

30 

amorphous 

silicon 

microbolometer 

<50 

7-14  (LWIR) 

~2.4 

A4-23 

L3 

Communications 

Thermal-Eye 

3640AS 

1.79 

1-1.3 

2.5 

25  x  18 

2.38 

160 x  120 

30 

No 

30 

amorphous 

silicon 

microbolometer 

<50 

7-14  (LWIR) 

~2.4 

A4-24 

Lumitron 

UC320U  OEM 
Microbolometer  Infrared 
Camera 

5.5 

3 

3 

4.1  x  3.2  to 
25  x  19 

2,  0 

320  x  240 

60 

No 

35 

microbolometer 

<60 

8-14  (LWIR) 

A4-25 

Lumitron 

UC320D  OEM 
Microbolometer  Infrared 
Camera 

7 

3 

3 

4.5  x  3.5  to 
69x53 

2,  0 

320  x  240 

60 

No 

51 

microbolometer 

<60 

8-14  (LWIR) 
Ref# | Manufacturer | Product | Notes
A4-1 | DRS NVEC | UTWS (Urban Thermal Weapon Sight II) | Used for day/night reconnaissance, surveillance and target acquisition for individual and crew weapons. The narrow FOV is designed for distant target detection and recognition. Lightweight (2 lbs).
A4-2 | DRS NVEC | CSTWS (Crew Served Thermal Weapon Sight II) | Used for day/night reconnaissance, surveillance and target acquisition for individual and crew weapons. 3x digital zoom.
A4-3 | DRS NVEC | MX-2 (Rugged miniature thermal imager) | High-performance, multipurpose tactical thermal imager for hand-held, tripod, or weapon-mounted employment. Removable eyepiece for remote display (helmet mountable).
A4-4 | DRS NVEC | COBRA-IR (Covert Over Barrier Recon Assistant - Infrared) | Compact tactical thermal and video periscope. Can be used as a hand-held or tripod-mounted covert camera system.
A4-5 | DRS NVEC | HelmetIR | This thermal imager can see in total darkness, through battlefield obscurants and foliage. Flexible monocular eyepiece with flip-up capability.
A4-6 | DRS NVEC | Mini-IR Plus (hand-held thermal imager) | Small enough to store in a BDU pocket.
A4-7 | DRS NVEC | PVS-7 Style (Single tube NVG) | Designed for US ground forces. Versatile system that delivers exceptional gain and resolution on the darkest nights. Quick release for one-hand mounting.
A4-8 | DRS NVEC | MANTIS (Multi-Adaptable Night Tactical Imaging System) | Can be hand-held for direct observation or weapon mounted for accurate night targeting.
A4-9 | DRS NVEC | 4x Raptor (4-power night vision weapon sight) | Is the most accurate night imaging device designed to meet military requirements for long range accuracy.
A4-10 | DRS NVEC | 6x Raptor (6-power night vision weapon sight) | Is the most accurate night imaging device designed to meet military requirements for long range accuracy.
A4-11 | ELCAN (Raytheon) | SpecterIR+ | Lightweight, rugged thermal weapon sight. Operational in sand, smoke, fog. Uncooled detector technology.
A4-12 | FLIR | ThermoVision Photon | Long-wave TI; gets clear imagery in total darkness, through smoke, fog, and most obscurants.
A4-13 | FLIR | ThermoVision ThermoSight | Long-wave TI; can be used hand-held for scouting, surveillance, and covert operations; small and lightweight; electronic bore sighting.
A4-14 | FLIR | ThermoVision A10 | World's smallest infrared camera; high sensitivity to detection; modular, highly flexible architecture supports a wide range of features, options and accessories.
A4-15 | FLIR | SC4000 HS MWIR / SC6000 HS-MWIR | New standard for military thermal imaging. Available in various wavebands across the spectrum.
A4-16 | FLIR | SC4000 HS LWIR / SC6000 HS-LWIR | New standard for military thermal imaging. Available in various wavebands across the spectrum.
A4-17 | Irvine Sensors | Personal Miniature Thermal Viewer | Unique shutterless design; custom optical, image processing, packaging and interface options available.
A4-18 | Irvine Sensors | CAM-NOIR Thermal Camera | Can operate over a broad temperature range without the need for temperature stabilization. "Instant on" capability.
A4-19 | Irvine Sensors | Miniature Camera | IR or RF control and data links. Remote shutter trigger input. Interchangeable lens system.
A4-20 | L3 Communications | Thermal-Eye X200xp | For target detection, force protection, routine patrols, search and rescue, disturbed surface/IED detection, covert surveillance, and fugitive pursuit.
A4-21 | L3 Communications | Thermal-Eye 3600AS | Small size and best-in-class power consumption, open architecture (easy access to video processing chain with sophisticated GUI).
A4-22 | L3 Communications | Thermal-Eye 3620AS | Small size and best-in-class power consumption, open architecture (easy access to video processing chain with sophisticated GUI).
A4-23 | L3 Communications | Thermal-Eye 3640AS | Small size and best-in-class power consumption, open architecture (easy access to video processing chain with sophisticated GUI).
A4-24 | Lumitron | UC320U OEM Microbolometer Infrared Camera |
A4-25 | Lumitron | UC320D OEM Microbolometer Infrared Camera |
Ref# | Manufacturer | Product | Detector resolution (pixels) | Pixel size (microns) | Detector Type | Spectral Response (microns) | Video refresh rate (Hz) | Video Output | At least 640 x 480 and 30 fps | Dimensions: Width (inch) | Dimensions: Height (inch) | Dimensions: Length (inch) | Notes
A5-1 | Intevac | NightVista | 640 x 480 | 12 x 12 | GaAs | 5-9 | 30 | RS-170 or interlaced digital video | Yes | | | | Developed for commercial security camera applications. Incorporates non-uniformity correction, bad pixel replacement, and histogram equalization image processing functions.
A5-2 | Intevac | ISIE6 | 1280 x 1024 | 6.7 x 6.7 | GaAs | 5-9 | ~27.5 | 10 bit digital output, progressive scan | Yes | | | |
A5-3 | Intevac | ISIE10 | 1280 x 1024 | 10.8 x 10.8 | GaAs | 5-9 | 37 | 10 bit digital output, progressive scan | Yes | | | | In development stages.
Annex B: Image Sensor Fusion Boards




Fusion 

Hardware 

Size 

Weight 

Chip  or 
board 

#  of 
inputs 

Processing 

speed 

Open 

arch 

Real-time

Power Consumption

Equinox DVP-4000

3” x 3”

2 oz

Chip

2

60 fps

YES

1.5 W

Imagize FP-3500

1.4” x 1.4” x 0.5”

0.75  oz 

2  board 
system 

2 

30  fps,  60 
fields/s 

YES 

0.6 W at 30 fps, 0.9 W at 60 fps

Irvine VIP/Balboa

20 Processors

40  MHz 

EPIX PIXCI-D2X

4.913” x 4.2”

3.3  V  or  5  V  PCI 
Signaling 

Octec ADEPT60

233.4 mm x 160 mm

Board 

2  Video 
Inputs 

YES 

5  V 

Acadia  1  PCI 
Vision  Board 

Single  Chip 

2 

30 fps, 60 fields/s

YES 

15  W 

VMETRO PMC-FPGA03

2 